A Null Space Sensitivity Analysis for Hydrological Data Assimilation with Ensemble Methods

Martin, Nick; White, Jeremy; Southard, Paul

doi:10.3390/hydrology12050106


by Nick Martin ¹,*, Jeremy White ² and Paul Southard ³

¹ Vodanube, LLC, Fort Collins, CO 80525, USA
² INTERA Incorporated, Fort Collins, CO 80525, USA
³ Hydrogeology, Department of Environmental Sciences, University of Basel, 4056 Basel, Switzerland
* Author to whom correspondence should be addressed.

Hydrology 2025, 12(5), 106; https://doi.org/10.3390/hydrology12050106

Submission received: 18 March 2025 / Revised: 18 April 2025 / Accepted: 24 April 2025 / Published: 28 April 2025

(This article belongs to the Section Hydrological and Hydrodynamic Processes and Modelling)


Abstract:

Predictive uncertainty analysis focuses on defensible variability in model projected values after estimation of the posterior parameter distribution. Inverse-style parameter estimation selects posterior parameters through history matching where parameters are varied and resulting model simulation values are compared to observations, and parameters are selected balancing goodness-of-fit between simulated and observed values and expert knowledge. When inverse-style parameter estimation approaches are used, parameter sensitivity, which is the change in simulated outputs relative to the change in parameter values, is an important consideration. Variation in null space parameters has a limited impact on history matching skill; however, these parameters become important when they impact predictions. A new null space sensitivity analysis for ensemble methods of data assimilation (DA) using observation error models is developed and implemented for an integrated hydrological model. Empirical parameter sensitivity is estimated by comparing the spreads of prior and posterior parameter distributions. Sensitivity analysis is generated by an ensemble of models with insensitive parameters varying across the prior parameter distribution and sensitive parameters fixed to best-fit model values. The result is identification of insensitive aquifer storage parameters that change storage-related model predictions by as much as two times. This null space analysis describes uncertainty from data insufficiency. Ensemble methods using observation error models also describe predictive uncertainty from noisy measurements and imperfect models.

Keywords:

uncertainty analysis; data assimilation (DA); ensemble methods; observation error model; null space; parameter sensitivity; bias–variance tradeoff; model representation error; PEST++ iterative Ensemble Smoother (iES); integrated hydrological model

1. Introduction

Data assimilation (DA) is a collection of techniques providing an optimal combination of imperfect model simulation results with noisy observations. It produces optimal projections of model state and inverse-style estimation of model parameters given available measurements. DA methods inherently describe uncertainty, or lack of knowledge, related to the combination of models and observations, and can explicitly include observation error models. Observation error models quantitatively describe uncertainty from observation error and model representation error [1].

Ensemble methods are a class of DA algorithms that represent relevant covariances with one or more finite ensembles. In the case of state estimation, the ensemble is typically obtained from empirical realizations, or samples, of simulated system states using Monte Carlo methods. In the inverse problem setting, the ensembles are of parameter realizations and the resulting model outputs. Commonly used ensemble methods include the ensemble Kalman filter (EnKF) for state estimation and the iterative Ensemble Smoother (iES) for parameter estimation [1,2].

iES is the ensemble method used here for DA-based parameter estimation for an integrated hydrological model, where input and observation noise uncertainty are explicitly considered. iES provides an ensemble approximation to the linear operator, i.e., the Jacobian matrix, used for inverse-style parameter estimation where each ensemble is a collection of parameter values. This ensemble approximation allows for dimensionality reduction and consequently reduced computational constraints for highly parameterized models, as well as some abilities with which to cope with non-linearity in the parameter–output relation [2].

Inverse-style parameter estimation involves varying parameter values across a predetermined range of feasible values from professional judgment in order to find which parameter values provide the closest match between simulated and observed values. This feasible range is the prior in a Bayesian sense. History matching is the assimilation of historic observations and is undertaken with iES by comparing simulated and historically observed values via a goodness-of-fit metric and adjusting parameters to minimize this metric. If the inverse problem is ill posed (that is, there are information deficits), then there will be parameters and parameter combinations that do not impact history matching. Parameters and parameter combinations that do not change the goodness-of-fit between simulation results and observations are in the null space of the linearized inverse problem [3,4].
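The null space of a linearized inverse problem can be illustrated with a small numerical example. The sketch below assumes a hypothetical Jacobian `J` with two simulated outputs and three parameters; the values are illustrative only and are not taken from the study. A singular value decomposition identifies the parameter combination that leaves the simulated outputs, and therefore the goodness-of-fit, unchanged.

```python
import numpy as np

# Hypothetical Jacobian: 2 simulated outputs, 3 parameters (illustrative values only).
J = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])

# Right singular vectors beyond the numerical rank span the null space.
U, s, Vt = np.linalg.svd(J)
tol = max(J.shape) * np.finfo(float).eps * s.max()
rank = int(np.sum(s > tol))
null_basis = Vt[rank:].T          # shape (3, 1): one null space direction

# Perturbing parameters along a null space direction leaves the outputs unchanged.
k = np.array([0.5, -1.0, 2.0])
k_perturbed = k + 10.0 * null_basis[:, 0]
change_in_outputs = J @ k_perturbed - J @ k
```

Here `change_in_outputs` is zero to within floating-point precision even though `k_perturbed` differs substantially from `k`, which is the behavior that makes null space parameters invisible to history matching.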

By definition, null space parameters can be set to any value without changing the history matching results and are informed only by professional judgment. In many applied groundwater modeling settings, null space parameters are important when changes to predicted and future system states are caused by variations in these insensitive parameters. Predictive uncertainty analysis, including considerations of null space parameters, may be undertaken after assimilation of models and observations and typically employs a form of Monte Carlo sampling from the identified insensitive range for null space parameters, such that the realizations used in the Monte Carlo representation maintain an acceptable fit to historic observations (i.e., they are approximate posterior samples) [3,4].

Null space-related predictive uncertainty analyses are commonly employed science and engineering techniques. For example, Ref. [5] uses these techniques for the optimal design of chemistry experiments and selection of control variables, and Ref. [6] employs them for fine-resolution image reconstruction from truncated data collection. Null space Monte Carlo (NSMC) predictive uncertainty descriptions have been used successfully for a variety of fluid movement-related analyses [3,4]. NSMC has provided predictive uncertainty analysis of highly parameterized groundwater flow and transport models in complex subsurface environments [3,7,8,9,10,11,12,13] and for integrated hydrological models [14]. The ensemble approximation means that the NSMC method cannot be used directly with ensemble methods because the empirical parameter-to-observation relation is usually rank-deficient and noisy. However, null space considerations have been incorporated as constraints in EnKF-style approaches [15,16].

We present and implement a new null space sensitivity analysis for ensemble methods for hydrological DA that supports several types of observation error models. Conceptually, this new approach is related to NSMC but focuses on sampling and exploring the prediction-sensitive null space parameters in an effort to demonstrate the effect of these important parameters to decision makers. Incorporation of observation error models means that uncertainty from observation error and, by extension, some forms of model representation error are explicitly included in the predictive uncertainty analysis. Because observation error and model representation error are descriptions of the specific measurements and forward models as implemented for a particular study site, the new analysis provides a framework that can be applied generally to guide and assess the prediction of system states in many common environmental modeling settings. However, the details of similar null space sensitivity analyses will need to be derived uniquely based on the characteristics of the study site, the predictive outcomes of interest, and the hydrological models used in assimilation. Suggestions are provided for using predicted system states to support decision making, despite the uncertainty identified from the null space sensitivity, observation error, and model representation error.

2. Data and Methods

DA is an optimal combination of model simulation results with observations. DA-based implementations are applied to a study area, and site-specific observations and model representations provide the custom framework for assimilation. Section 2.1 presents the study site, models, and observations used in the example DA implementation. Section 2.2 identifies categories of forward models for which DA provides beneficial decision support. DA methods, including ensemble methods, observation error models, and the bias–variance tradeoff, are discussed in Section 2.3. Finally, Section 2.4 presents the conceptual and theoretical development of the empirical framework for null space sensitivity analysis with ensemble methods.

2.1. Study Site, Observations, and Models

The new research presented herein is an empirical null space sensitivity analysis that is implemented and applied after DA. This new analysis was developed and applied during a multiyear research project called the Blanco River Aquifers Assessment Tool (BRAAT) Project. Herein, a brief summary of the BRAAT project is provided, as it is only used to demonstrate the novel null space sensitivity analysis for ensemble methods employing observation error models.

The BRAAT project had two phases: (1) conceptualization and (2) integrated hydrological model development and implementation. The “Conceptual Model Report” [17] documents the site and process conceptualization for the first phase. Appendix A.1 Table A1 provides access to the Conceptual Model Report and to the source code, model files, and other documents detailing the second phase. Data sources used in the BRAAT project are listed in Table A2 and Table A3.

Additional summary information on the second phase of the BRAAT project is available in Appendix A.2. Figure 1 identifies the study area and important hydrologic features. Figure A1 shows the topography and geological structures in the area. The purpose of BRAAT is to examine two primary threats to the future availability of water resources in the study area:

Changing weather patterns, particularly increased drought frequency and magnitude, resulting from global climate change;
Increased water demands in this rapidly growing and developing portion of the San Antonio–Austin corridor.

Two types of state observations provide information for assimilation through history matching: (1) water level elevations in wells and (2) stream discharge from gauging stations. Figure 2 identifies locations for dynamic state observations, and Appendix A.2.1 provides summary information concerning available observations. Significant uncertainty and observation errors are expected in relation to the available state observations, as described in the following.

Water elevations observed in wells are available upon request from the agencies listed in Table A3. Expectations for uncertainty and observation error are generated as follows.
- At the time of data processing for this study in 2021, there were no dedicated monitoring well or piezometer observations available in the study area. All water level observations came from wells that are, or may be, pumped with some unknown frequency.
- Most of the agencies listed in Table A3 do not have the resources and funding to collect location surveys stamped by a registered surveyor or to generate scientific data products that include cleaned, filtered, and verified records based on published data standards in addition to raw field observations.
- Many water level observations are available as point-in-time observations that occur sporadically with a month or more separating measurements.
- Figure 2 shows the locations of 96 wells with water level observations within the study area and identifies sparse spatial coverage for these well-based observations. The Middle Trinity Aquifer has by far the largest number of wells in the focus area, with 53. The focus area is the “Integrated Focus Area” in Figure 2. The other three aquifers examined (the Balcones Fault Zone (BFZ) Edwards Aquifer, Upper Trinity Aquifer, and Lower Trinity Aquifer) have three wells each within the focus area. For the Middle Trinity Aquifer, 53 wells in the focus area correspond to an average density of one observation location per 40.5 km².
Stream discharge estimates are available from the data portals identified in Table A2. Ref. [19] provides a detailed uncertainty analysis for the BRAAT project and the 16 gauging stations shown in Figure 2. The limitations of these data sets are summarized below.
- Water stage recorders, which measure the static water level in the stream or river, and a rating curve are used to calculate discharge at each of these stations. Expected measurement errors for discharge calculated using a rating curve are ±50–100% for low flows, ±10–20% for medium or high in-bank flows, and ±40% for out-of-bank flows [20]. A rating curve provides an empirical and poor-quality hydrodynamics model [19].
- Only 2 out of 16 stations have Water Year (WY) 2021 data quality assessed as “good”. The other 14 stations have “fair” to “poor” data quality assessments [19].
- Low flows are important in the study area and occur more frequently than typically expected because of direct and fast communication between surface waters and groundwater in this karst terrain [17,21,22,23,24]. Low flows carry the largest expected relative measurement error, up to ±100% [20].

The Conceptual Model Report identified that an integrated hydrological modeling approach was required to describe and predict integrated hydrological water budgets for decision support related to climate change and increased economic development. Water budget basins for surface water accounting and water budget regions for groundwater accounting are shown in Figure 3. Table A4 provides labels for the water budget regions. Project stakeholders selected a custom integrated hydrological modeling implementation, which used the three open-source modeling components listed below and is schematically described in Figure A2. Additional information concerning custom integrated hydrological model development and application is available in Appendix A.2.2, Appendix A.2.3, and Appendix A.2.4.

The Hydrological Simulation Program–FORTRAN (HSPF) represents water movement and storage in watersheds, rivers, and creeks.
- The “Blanco River Water Budget Basin” and “Onion Creek Water Budget Basin” in Figure 3 are the budget accounting areas for surface water and the HSPF model.
MODFLOW 6 simulates groundwater movement and storage.
- The “Water Budget Regions” identified in Figure 3 and labeled in Table A4 are the accounting volumes for groundwater.
The Unsaturated Zone Flow (UZF) advanced stress package from MODFLOW 6 links HSPF and MODFLOW 6 and represents variably saturated zone considerations.
- Because UZF links groundwater with surface water, it contributes to water budget accounting in both water budget basins and water budget regions.

2.2. Hydrological and Hydrodynamical Model Representations

Figure 4 gives a loosely defined demarcation between hydrological and hydrodynamical models based on the amount of classical physics versus empirical definitions within the algorithms that make up the model. Here, hydrodynamical models employ primarily classical physics-based descriptions of the forces driving water movement. Direct numerical simulation (DNS) [25] and Reynolds-averaged Navier–Stokes equations (RANS) are the hydrodynamical models shown in Figure 4.

DNS is a computational fluid dynamics (CFD) simulation that directly provides the solutions to the governing physical and mathematical equations without simplifications or sub-models, like a turbulence model. Because there is no simplification, or regularization, the simulation domain must be physically resolved at high spatial resolution (finer than micrometers) and with short time-integration steps. As a result, applying DNS for most “real” problems is not feasible due to the resolution requirements [26]. Reynolds-averaged Navier–Stokes equations (RANS) employ regularization through time- and space-based averaging in conjunction with a turbulence sub-model, which attempts to simulate turbulent structures with respect to time and location [27,28]. For the purposes of this paper, hydrodynamical models contain sufficient physics and rigorous mathematics to allow theoretical derivation of stability and accuracy through comparison of discrete representations within the model to the governing equations provided by classical physics. For example, Ref. [29] provides a stability and accuracy analysis of the RANS Tidal, Residual, Intertidal Mudflat (TRIM) Model [30].

Hydrological representations employ empirical relationships and make simplifying assumptions to explicitly represent (only) a subset of the forces driving water movement. HSPF and MODFLOW 6 are examples of the hydrological category, have important empirical components, and only represent a subset of fluid movement forces, compared to RANS and DNS. HSPF utilizes hydrologic response units (HRUs) and well-mixed reservoirs to create a network of ordinary differential equations (ODEs) and empirical relationships. Continuity, or mass conservation, is represented in terms of each ODE within the network, and this network is an example of hydrologic routing [31]. HSPF computes flow across a stream reach or reservoir based on two assumptions: (1) there is a fixed empirical relationship between depth, volume, and discharge, and (2) discharge is an empirical function of volume. This means that flow reversals and backwater effects in an upstream reach are not simulated, and momentum is not considered in HSPF routing computations [32]. In contrast, RANS and DNS provide inherent conservation of momentum (where momentum is the product of mass and velocity).

MODFLOW 6 represents or solves the groundwater flow equation, which is a system of partial differential equations (PDEs) representing conservation of fluid mass, or continuity [33,34]. The groundwater flow equation is in terms of porous media fluid potential, also called hydraulic head. Hydraulic head is the sum of elevation and pressure heads, and it provides a description of fluid potential under nearly static conditions with very small fluid velocities. Because groundwater velocities must be very small, the groundwater flow equation, and thus MODFLOW 6, does not represent or otherwise account for momentum. Darcy’s law is the empirical relationship that provides for calculation of fluid flux from the head gradient [35,36].

The purpose of discussing and comparing hydrological and hydrodynamical models is to identify expectations of the precision and accuracy of model representations. Hydrodynamical models, like DNS and RANS models, are expected to be both accurate and precise and to depict all forces driving fluid movement. Calibration is the identification of model parameters and inputs that produce simulated values that agree with equivalent measured values within a predetermined tolerance. DNS models do not require calibration and only require specification of parameters and inputs at very high spatial and temporal resolution. RANS models may require calibration because the time- and space-based averaging inherent to these methods will generate some degree of model bias. Observations, or measurements, for calibration of a RANS model could come from DNS model simulation results or a high-resolution direct measurement device like an acoustic Doppler current profiler (ADCP).

In contrast, hydrological models represent a subset of fluid movement forces and use empirical relationships to provide approximate, partial, and inherently limited depictions of reality. Precision and accuracy relative to high-resolution measurements are not expected. Rather, hydrological models provide “back of the envelope” analyses and best guesses that are helpful in constraining water resource management decisions and optimizing operational considerations. Due to the approximate and limited reality represented by a hydrological model, a collection of hydrological models may be needed to fully depict feasible conditions and scenarios for hypothesis testing and decision support. When order of magnitude and dimensional analyses suggest that a single type of hydrological model could adequately represent the fluid movement forces pertinent to a study, the collection of hydrological models can be composed of the same type of hydrological model (i.e., HSPF or the groundwater flow model from Figure 4) with collection members differentiated by parameterization. When a single type of hydrological model is inadequate to describe the pertinent force balance, the collection may consist of multiple types of models with multiple parameterizations.

DA allows for explicit inclusion of model limitations in the optimal combination of model simulation results with observations. Additionally, ensemble methods can generate collections of approximate, limited-accuracy, but equally valid, models. Consequently, we suggest using DA in preference to calibration for hydrological models, as identified in Figure 4.

2.3. Data Assimilation (DA)

DA is an umbrella concept that covers a group of algorithms for optimal combination of information from imperfect numerical model simulations with noisy observations. It uses a “forward” numerical model to make predictions for unobserved quantities and system states, such that it relies on the processes and physics encoded in the model to fill in the gaps regarding system behavior that the available observations do not represent. Measurements, usually observed system states (or quantities derived from these states), are assimilated with the predicted equivalent values to derive updated values, typically using a Bayesian framework. History matching of simulated values to observations provides conditioning for, or statistical learning about, the optimal predictions for unobserved values. The goal is to obtain an optimal description of a dynamical system and the inherent uncertainty with the updated values. Herein, we use a formal data assimilation framework to update parameters, which represent uncertain model input quantities.

Bayes’ theorem, Equation (1), provides the conceptual basis for most DA algorithms. It shows how to rigorously update prior information as new information becomes available [1]. Equation (1) explicitly represents and quantifies parameter uncertainty, where $k$ represents model parameters. Historic state observations (also known as “targets”) are $h$. $P$ signifies a probability distribution, $P(k)$ is the prior parameter probability distribution, $P(h \mid k)$ is the likelihood function, and $P(k \mid h)$ is the posterior parameter probability distribution. $P(k)$, the prior parameter probability distribution, is determined prior to assimilation from pertinent data sets and professional judgment. The posterior parameter probability distribution $P(k \mid h)$ is the probability distribution of feasible model parameters, i.e., those in the prior parameter distribution, updated by conditioning to observations [37].

$$P(k \mid h) \propto P(h \mid k)\, P(k) \tag{1}$$
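A one-parameter numerical illustration of Equation (1) may help fix ideas. The sketch below is hypothetical: the scalar parameter grid, the Gaussian prior, the linear forward model (simulated value equal to two times the parameter), and the single observation are all assumptions made for illustration and are not part of the study. The product of likelihood and prior is normalized so that the posterior sums to one, and the spreads of the prior and posterior are then compared.

```python
import numpy as np

# Hypothetical scalar parameter k on a grid (illustrative values, not from the study).
k_grid = np.linspace(0.0, 10.0, 1001)

# Prior P(k): Gaussian centered on an assumed expert-judgment value.
prior = np.exp(-0.5 * ((k_grid - 5.0) / 2.0) ** 2)

# Likelihood P(h | k): assumed forward model h_sim = 2 k and one noisy observation.
h_obs, sigma = 8.0, 1.0
likelihood = np.exp(-0.5 * ((h_obs - 2.0 * k_grid) / sigma) ** 2)

# Posterior P(k | h): likelihood times prior, normalized to sum to one.
posterior = likelihood * prior
posterior /= posterior.sum()
prior /= prior.sum()

def spread(weights):
    """Standard deviation of k under the discrete distribution `weights`."""
    mean = np.sum(weights * k_grid)
    return np.sqrt(np.sum(weights * (k_grid - mean) ** 2))

prior_std, posterior_std = spread(prior), spread(posterior)
```

In this toy case the observation is informative, so `posterior_std` is much smaller than `prior_std`. A parameter that the observation does not constrain would retain a posterior spread close to its prior spread.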

Variability among parameter fields that sample the posterior parameter probability distribution, $P(k \mid h)$, comes from three sources [1], which are enumerated below.

Inherent prior parameter uncertainty, $P(k)$, from limitations in scientific understanding;
Data insufficiency caused by insufficient information within the observed data set of important outcomes, i.e., target observations, to significantly reduce parameter uncertainty. In other words, available data may be insufficient to condition the range of parameter values from the naturally diffuse prior range;
Observation errors consisting of observation measurement errors and model representation errors. They can be explicitly represented in DA using an observation error model.

These three sources of variability are not independent. Inherent parameter uncertainty, observation error, and model representation error can only be ameliorated or limited through sufficient data.

2.3.1. Ensemble Methods

Ensemble methods are a class of data assimilation algorithms that use an ensemble of parameter realizations and the corresponding ensemble of important simulated outputs to approximate the first-order relationship between parameters and simulated outputs of interest. These parameter realizations are usually drawn from the prior parameter distribution ($P(k)$), such that after (repeated application of) the analysis step, the parameter ensemble is an approximate sample of the posterior parameter distribution ($P(k \mid h)$). The iterative ensemble smoother (iES) tool set in PEST++, PESTPP-IES [2,37,38], is a scalable and general implementation of the algorithm of Ref. [39], which has been used successfully to estimate posterior parameter ensembles in many environmental modeling settings.
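The idea of approximating the first-order parameter-to-output relationship from an ensemble can be sketched with a toy linear model. The example below is an illustration under stated assumptions and is not the PESTPP-IES algorithm: the forward model `forward`, the matrix `J_true`, and the ensemble size are hypothetical, and the empirical sensitivity is obtained with an ordinary least-squares fit between parameter and output anomalies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear forward model with 3 parameters and 2 outputs (illustrative only).
J_true = np.array([[1.0, 2.0, 0.0],
                   [0.0, 1.0, 1.0]])

def forward(k):
    """Toy forward model: simulated outputs for one parameter realization."""
    return J_true @ k

# Draw a prior parameter ensemble: 20 realizations (rows) of 3 parameters.
n_real = 20
K = rng.normal(loc=0.0, scale=1.0, size=(n_real, 3))
O = np.array([forward(k) for k in K])     # simulated outputs, shape (20, 2)

# Anomalies: deviations of each realization from the ensemble mean.
dK = K - K.mean(axis=0)
dO = O - O.mean(axis=0)

# Least-squares fit dO ~ dK @ J_ens.T gives an empirical first-order sensitivity.
J_ens_T, *_ = np.linalg.lstsq(dK, dO, rcond=None)
J_ens = J_ens_T.T
```

Because this toy model is exactly linear and the ensemble has more realizations than parameters, `J_ens` reproduces `J_true`. With fewer realizations than parameters, or with a non-linear model, the empirical relation would be rank-deficient and noisy.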

PESTPP-IES was selected for this study to leverage three important advantages of ensemble methods relative to traditional parameter estimation approaches [2,19], which are enumerated below.

Ensemble methods can cope with higher levels of non-linearity in the parameter–output relationship compared to derivative methods.
- Watershed and surface water numerical models tend to have significant non-linearities because of the hyperbolic nature of the governing physical equations, i.e., DNS and RANS.
- The empirical relationships between volume and discharge in HSPF for the BRAAT project were designed to be non-linear to address competing demands between river discharge downstream and seepage loss discharge.
They tend to be relatively efficient computationally when dealing with very large numbers of parameters, which are commonly encountered in applied groundwater modeling.
- MODFLOW 6 represents groundwater flow using a PDE form of continuity. PDE models require that parameters be specified for every cell or node in the computational grid and are highly parameterized models because they have large numbers of parameters.
- For the BRAAT project, the MODFLOW 6 model has 2,324,780 active cells. Five parameters (hydraulic conductivity in the x-direction, $K_{x x}$ , hydraulic conductivity in the y-direction, $K_{y y}$ , hydraulic conductivity in the z-direction, $K_{z z}$ , specific storage, $S_{s}$ , and specific yield, $S_{y}$ ) are specified for each active cell, which results in the possibility of 11,623,900 flow property parameters in the absence of regularization.
PESTPP-IES provides explicit incorporation of many forms of observation error models, which account for observation error and model representation error, into the assimilation process.
- Observation error models for this effort are discussed in Section 2.3.2.
- A significant observation error is expected for the observations available in the study area, as discussed in Section 2.1.
- Hydrological models are used in the integrated hydrological modeling, as discussed in Section 2.2. Hydrological models entail expectations of approximate solutions without high-resolution accuracy and precision, which means that significant model representation error is expected.

PESTPP-IES was selected for these three reasons, even though it does not work with the existing NSMC analysis tools in PEST (https://pesthomepage.org/, accessed on 6 February 2025) [4,40] and data sufficiency limitations were identified in the Conceptual Model Report.

History matching skill between simulated and observed values is identified within PEST and PEST++ tool sets, including PESTPP-IES, using the objective function, $\Phi$, with lower $\Phi$ values corresponding to increased history matching skill and likelihood in a Bayesian sense. $\Phi$, in PEST++ tool sets, is the sum of squared weighted residuals; residuals are also known as “innovations” in DA terminology. A residual is the difference between an observed value and the corresponding simulated value. The goal of this inverse approach to model training is to find many samples from $P(k \mid h)$ in Equation (1), which is the probability distribution of model parameters that are coherent with the prior and conditioned on the observations.
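As a concrete reading of the definition above, the objective function for a single parameter realization can be written in a few lines. The function name `phi` and the observed, simulated, and weight values are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import numpy as np

def phi(observed, simulated, weights):
    """Sum of squared weighted residuals for one parameter realization."""
    residuals = np.asarray(observed, dtype=float) - np.asarray(simulated, dtype=float)
    return float(np.sum((np.asarray(weights, dtype=float) * residuals) ** 2))

# Hypothetical values: three targets with different weights.
observed = [10.0, 20.0, 30.0]
simulated = [11.0, 18.0, 30.0]
weights = [1.0, 0.5, 2.0]
phi_value = phi(observed, simulated, weights)
```

The residuals are −1, 2, and 0, the weighted residuals are −1, 1, and 0, and `phi_value` is therefore 2.0. A target with a weight of zero contributes nothing to the sum.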

The parameters, listed below, from all three coupled models (HSPF, UZF, and MODFLOW 6) are adjusted during assimilation. These parameters are members of both $P(k)$ and $P(k \mid h)$.

Some 184 UZF parameter values were optimized during assimilation, as shown in Table A5.
Some 409 HSPF parameter values, all of which relate to empirical relationships between volume and discharge in hydrological routing, were optimized, as listed in Table A5. These 409 parameters represent individual rows in 48 volume-to-discharge lookup tables, or FTABLES. Consequently, there are only 48 empirical lookup tables that internally have a few rows that vary during assimilation.
A total of 4746 MODFLOW 6 parameters were constrained as part of DA, as listed in Table A6.

Note that standalone assimilations were conducted for HSPF and MODFLOW 6 prior to integrated hydrological model assimilation. The primary purpose of the standalone assimilations was to generate a reasonable starting point and thus prepare the independent models for integration. Standalone HSPF assimilation also allowed for simplification and a reduction in the number of parameters assimilated, as the parameters listed in Table A7 were varied during standalone assimilation and then fixed for coupled assimilation. In contrast, standalone MODFLOW 6 assimilation was used to constrain prior parameter distributions to a tighter range than that dictated solely by professional judgment and did not serve to reduce the number of MODFLOW 6 parameters in coupled assimilation. A smaller prior range may provide computational efficiencies because fewer internal realizations in the assimilation (i.e., model simulations) can adequately sample the prior.

2.3.2. Observation Error Models

One source of $P(k \mid h)$ variability is observation error. DA methods may, and PESTPP-IES does, provide support for arbitrarily complex observation error models, which can include observation measurement error and model representation error components. PESTPP-IES implementation of observation error model functionality is the inclusion of realizations of additive observation uncertainty (i.e., added to observed values) with $P(k)$ realizations in the forward model used to explore $P(k \mid h)$ uncertainty and, ultimately, the predictive uncertainty of the model. Additive observation uncertainty is included through the use of noisy observation values in each adjustment of each parameter realization during the data assimilation parameter adjustment process, resulting in unique innovations and objective function values for each parameter realization. A unique distribution of additive observation uncertainty is specified for each target observation location and history matching time interval [37].

In implementation, the adjusted, noisy observation value is obtained using a white noise error term, which is represented with a Gaussian variate. The amount of noise added to each target is stochastically sampled from the Gaussian variate, uniquely defined for each observation, for each realization and ensemble member in the assimilation. Observation error model Gaussian variates in PESTPP-IES are defined with a mean (

μ

) of zero and a standard deviation (

σ

) specified by the user. An observation error model is then implemented in PESTPP-IES by specifying a

σ

value for each observation. Note that separate target observations are defined for each time interval for all observation locations. Consequently, PESTPP-IES makes no assumptions, and applies no restrictions, with regard to the time stationarity of target observations or of observation error models.
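The sampling described above can be sketched in a few lines. This is a minimal illustration, assuming NumPy; the function name, the observed values, and the σ values are hypothetical and are not taken from PESTPP-IES or from this study.

```python
import numpy as np

def noisy_observation_realizations(observed, sigma, n_realizations, seed=0):
    """Return an (n_realizations, n_obs) array of observed values plus
    zero-mean Gaussian noise, with one sigma per observation."""
    rng = np.random.default_rng(seed)
    observed = np.asarray(observed, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # One independent draw per realization and per observation;
    # sigma broadcasts across the realization (row) axis.
    noise = rng.normal(loc=0.0, scale=sigma,
                       size=(n_realizations, observed.size))
    return observed + noise

# Hypothetical targets: three observations, each with its own sigma.
observed_values = [250.0, 262.5, 12.0]
sigma_values = [2.0, 2.0, 5.0]
noisy = noisy_observation_realizations(observed_values, sigma_values,
                                       n_realizations=300)
```

Each row of `noisy` is the set of target values seen by one realization, which is why each realization ends up with its own innovations and objective function value.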

Explicit inclusion of an observation error model allows ensemble methods to identify and examine prior-data conflict [41,42,43]. Prior-data conflict occurs where simulated values, produced from evaluating the prior parameter ensemble, do not agree in a statistical sense with the observed values plus additive uncertainty from the observation model. Agreement is typically measured by the statistical distance between the ensemble of simulated outputs and the ensemble of noisy observation realizations.
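One simple way to express such a check is an interval overlap test between the two ensembles. The sketch below is an assumed criterion chosen for illustration (min-max ranges that do not overlap); it is not necessarily the statistical distance used by PESTPP-IES, and the function name and example values are hypothetical.

```python
import numpy as np

def prior_data_conflict(sim_ensemble, noisy_obs_ensemble):
    """Flag observations whose simulated ensemble range and noisy
    observation ensemble range do not overlap.

    Both inputs have shape (n_realizations, n_obs)."""
    sim = np.asarray(sim_ensemble, dtype=float)
    obs = np.asarray(noisy_obs_ensemble, dtype=float)
    sim_lo, sim_hi = sim.min(axis=0), sim.max(axis=0)
    obs_lo, obs_hi = obs.min(axis=0), obs.max(axis=0)
    # Two intervals are disjoint when one ends before the other begins.
    return (sim_hi < obs_lo) | (obs_hi < sim_lo)

# Hypothetical prior-ensemble outputs and noisy observations (3 realizations, 2 targets).
sim = [[10.0, 50.0], [12.0, 55.0], [11.0, 52.0]]
obs = [[11.5, 80.0], [9.0, 85.0], [13.0, 78.0]]
conflicts = prior_data_conflict(sim, obs)  # second target is in conflict
```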

Lack of agreement indicates the presence of additional unaccounted-for sources of uncertainty and error such as model error. It implies that extreme parameter values or extreme parameter combinations will be needed to reproduce the conflicted observation values if DA is undertaken. In this case, extreme parameter estimates are “biased” estimates. Continuation of parameter adjustments in the presence of prior-data conflict, as part of history matching, will generate parameter bias and, depending on the outputs of interest to decision makers, could result in biased predictive outcomes and, ultimately, suboptimal resource management. The identification and elimination of prior-data conflict provides a way to address, or counteract, representation error related to the forward model’s ability to simulate observed values. When prior-data conflicts are removed, representation error is not propagated into

P (k | h)

. It should be noted that prior-data conflicts are only removed from the calculation of the objective function,

Φ

. The forward model still provides simulated values corresponding to the target observations, and these simulated values may be used in the calculation of validation metrics.

Water level elevations measured in wells and stream discharge values calculated at gauging stations are the target data sets for history matching as part of DA with ensemble methods. Water level observations are matched to simulated hydraulic head values in MODFLOW 6. Discharge observations are compared to discharge simulated by HSPF for each gauging station shown in Figure 2 using an empirical relationship between depth, volume, and discharge. Observation error models are derived for both types of target observations to include considerations of measurement error and model representation error.

As discussed in Section 2.1, a significant observation error is expected for water level elevations because an official location survey is not available for many of the wells; pumping occurs in most, if not all, observation wells, and it is not known when pumping has occurred relative to collection of water level observations; and water level measurements are available as raw observations, rather than scientifically processed data products. Observed water levels are point measurements that occur within a well casing that likely has a cross-sectional area between 0.02 and 0.08 m².

Groundwater levels, which approximately correspond to water level observations, are simulated with MODFLOW 6 using a computational grid with cell areas ranging from 5625 m² (finest resolution) to 1,440,000 m² (coarsest resolution). Figure A1 shows rapid spatial variation in topographic elevation within the study area and identifies a high density of “Other Faults”. Each identified fault is a linear location of observed vertical offset or a change in topographic elevation. Consequently, significant model representation error is expected for water level observations. For target water level observations, a

σ

value of 9.14 m (30.0 ft) provides a temporally and spatially stationary observation error model. This

σ

value was somewhat arbitrarily selected based on a professional judgment estimate of what the typical elevation error might be between the model representation and reality. Subsurface mixtures of materials, arrangement of hydrogeological units, and elevations of units are largely unobservable (at least at the scale of a grid cell), and so an expected error calculation is not feasible for

σ

estimation.

Ref. [19] provides the development, calculation, and documentation of observation error models for discharge observation targets, which are compared to HSPF-simulated discharge values, for all stream gauging stations shown in Figure 2. In this case, the root mean square error (RMSE) (see Equation (A2)) value for each observation time interval, which was developed from the stochastic demarcation of the discharge uncertainty envelope for each gauging station, provides

σ

values that define the observation error models for PESTPP-IES. There are 16 observation locations and 36 history matching time intervals for a total of 576 discharge observation error models. Discharge observation error models are temporally and spatially non-stationary and incorporate an amount of uncertainty in agreement with the amount of measurement error expected in the flow regime.

2.3.3. Complexity and the Bias–Variance Tradeoff

History matching selects optimal parameters in an inverse-style calculation. It is used to implement statistical learning and training in AI-ML; calibration of models that employ algorithms and physics-based representations expected to have a high degree of accuracy and precision; and data assimilation, which combines approximate models with noisy and imperfect observations. Inverse-style parameter estimation means that parameters are varied across a range of values identified from professional judgment in order to find which parameters, or collections of parameters, provide simulation results that compare most favorably with independently observed values. As part of training, calibration, and assimilation, prediction bias is traded against variance in predicted values using model complexity to avoid overfitting [4,37,44,45,46].

Overfitting occurs when model performance in terms of history matching is significantly better during training, calibration, or assimilation relative to the best-fit model performance on independent validation observations. An independent validation data set is one that has not been part of parameter optimization during training, calibration, and assimilation. Overfitting occurs because the optimization of model parameters seeks the best performance on the target observations and can learn noise and errors unique to the target observation data set when models and observations are imperfect [4,37,44,45,46].

Figure 5 provides a graphical explanation of the tradeoff between prediction bias and variance that is moderated by model complexity. The bias–variance tradeoff is that variance generally increases and bias decreases with increased model complexity, and that the opposite occurs as complexity decreases. Regularization is the reduction in dimensionality of the model and in the number of model parameters optimized during history matching, which is a reduction in the number of degrees of freedom in the analysis. Consequently, regularization techniques reduce model complexity, and a model is made more complex by increasing dimensionality and the number of parameters optimized during history matching. Panel (A) describes the bias–variance tradeoff for perfect observations while Panel (B) description applies to noisy and biased observations [46].

Figure 5B presents the modified bias–variance tradeoff for DA where observation and model prediction uncertainty are explicitly represented using an observation error model. The goal of an observation error model is to introduce and maintain variance, or variability, in the

P (k | h)

fields with the purpose of optimizing the bias–variance tradeoff for DA, as described schematically in Figure A3 [19]. The point of optimizing the variability in

P (k | h)

is to introduce sufficient variability to avoid overfitting and selection of

P (k | h)

fields that are too narrow and produce biased predictions. The associated and counterbalancing consideration in this optimization is that too much variability results in the inability to differentiate among options or scenarios because any answer is feasible. Consequently, too much variability reduces the capacity of the model to support decision making [46].

Regularization techniques are used to simplify HSPF and MODFLOW 6 representations as identified below. HSPF is an inherently simplified formulation relative to MODFLOW 6 because HSPF solves a network of ODEs, while MODFLOW 6 solves a system of PDEs. ODEs have one dimension that changes, and PDEs represent changes in multiple dimensions. Consequently, HSPF is a reduced dimensionality representation and has reduced complexity relative to MODFLOW 6.

Regularization to reduce HSPF model complexity:
- A standalone HSPF-only assimilation was conducted prior to the integrated model training. The watershed parameters listed in Table A7 were optimized during this preliminary assimilation and then fixed for integrated hydrological model assimilation. The result is that only 409 HSPF parameter values were varied during integrated hydrological model assimilation.
Four regularization techniques reduced the number of assimilated parameters from more than eleven million to fewer than five thousand for the MODFLOW 6 model component. Additional details are provided in Appendix A.2.3.
- Regionalization into seven water budget regions;
- Regionalization into homogeneous hydrostratigraphic zones within water budget regions for relatively low-transmissivity units, see Figure A4 and Figure A5 and Table A8 and Table A9;
- Geostatistical interpolation within hydrostratigraphic zones for the four aquifers, see Figure 6 for pilot point (PP) locations;
- Use of the empirical relationship in Equation (A1) so that one storage parameter, instead of two, is included in optimization of parameter values.

2.4. Null Space Sensitivity Analysis

Sensitivity is the variation in model solution values due to variability or uncertainty in one or more model input values. To improve the representativeness of sensitivity analysis, it should be conducted after history matching and designed to respect posterior parameter distributions (

P (k | h)

in Equation (1)) [3,4,47]. The purpose of sensitivity analysis is to explicitly identify the relative amount of expected variability in the important simulated outcomes arising from one or more parameters, where the parameters vary within the spread of

P (k | h)

. There are four types of sensitivity identified in Ref. [47] and listed below, which determine if and where samples from

P (k | h)

identified through DA can provide effective decision support.

Type I: variation in a parameter within the bounds of the prior parameter distribution, $P (k)$ in Equation (1), causes insignificant changes in $Φ$ , which is the goodness-of-fit objective function for history matching with PEST++, and in model predictions employing parameters from the posterior parameter distribution, $P (k | h)$ . Note that $P (k | h)$ is a conditional subset of $P (k)$ .
Type II: variation of a parameter causes significant changes in $Φ$ but insignificant changes in model predictions.
Type III: variation of a parameter causes significant changes to both $Φ$ and model predictions.
Type IV: variation of a parameter causes insignificant changes to $Φ$ but significant changes to the model predictions.
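The four types amount to a two-by-two classification on whether a parameter significantly changes Φ and whether it significantly changes predictions. A minimal sketch follows; the names are hypothetical, and deciding what counts as "significant" is left to the analyst.

```python
from enum import Enum

class SensitivityType(Enum):
    TYPE_I = "I"      # insignificant change in phi and in predictions
    TYPE_II = "II"    # significant change in phi, insignificant in predictions
    TYPE_III = "III"  # significant change in both
    TYPE_IV = "IV"    # insignificant change in phi, significant in predictions

def classify_sensitivity(changes_phi: bool, changes_predictions: bool) -> SensitivityType:
    """Map the two significance judgments onto the four sensitivity types."""
    if changes_phi:
        return SensitivityType.TYPE_III if changes_predictions else SensitivityType.TYPE_II
    return SensitivityType.TYPE_IV if changes_predictions else SensitivityType.TYPE_I

def in_null_space(kind: SensitivityType) -> bool:
    """Null space parameters are those that do not significantly change phi."""
    return kind in (SensitivityType.TYPE_I, SensitivityType.TYPE_IV)
```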

DA using PEST++ or PEST provides an inverse-style approach for determination of

P (k | h)

from

P (k)

. Parameters are in the null space when variation of these parameters across

P (k)

causes insignificant changes to

Φ

. Parameters that have Type I and IV sensitivity exist in the null space, and these parameters can be set to any reasonable value without impacting history matching skill. Type I parameters show minimal change in spread between the prior and posterior parameter distributions, and, more importantly, they have no impact on model predictions. However, Type IV parameters do impact model predictions even though the spread of posteriors is minimally constrained relative to prior parameter distributions. This means that model predictions using

P (k | h)

provide limited decision support because the posterior parameter distributions are not constrained by observations, and the model itself is known to be imperfect (because DA should only be used with imperfect models, and all water movement models are imperfect with the exception of DNS).

Derivative-based inversion algorithms can produce a full-rank Jacobian matrix. The NSMC method uses these full-rank Jacobian and subspace concepts to precondition

P (k)

realizations so that they honor observation data within the limits of an implied linearity assumption, essentially yielding posterior parameter realizations in a very numerically efficient manner. Several open-source tools implement the NSMC algorithm to facilitate efficient predictive uncertainty analysis in relatively high dimensions and at real-world scales [4,40].

Alternatively, ensemble-based methods, such as those in PESTPP-IES, seek a posterior parameter ensemble using an empirically approximated Jacobian matrix that is relatively rank-deficient. This rank deficiency means that the NSMC approach cannot be applied directly when using ensemble methods. While ensemble methods seek a posterior parameter ensemble, they may not fully sample the range of Type IV parameter sensitivities unless a large number of realizations are used. The ability to leverage a relatively small number of realizations provides computational efficiency advantages for ensemble methods when using very large numbers of parameters. If a large number of realizations must be used, then the computational advantages of ensemble methods relative to other methods, like Markov chain Monte Carlo (MCMC), are reduced.

In the implementation of PESTPP-IES assimilation for parameter estimation, the scientist provides PESTPP-IES with the ranges for the priors as program inputs. PESTPP-IES executes the assimilation as specified by the scientist. The scientist must then select the posteriors by analyzing the entire assimilation record. Selection of posteriors will always be dependent on the forward models and observation error models employed, site-specific observations used, and the dynamic characteristics of the time interval covered by the assimilation. Consequently, posterior selection is an empirical process that is site- and study-dependent and that must be completed prior to sensitivity analysis.

A

Φ

threshold is one way of selecting posteriors from the complete assimilation record. When using a threshold, all realizations across all iterations are filtered to the collection of realizations that produce a

Φ

value that is less than the threshold. The parameter ranges from the collection of realizations with

Φ

less than the threshold identify the selected posterior ranges. When using a

Φ

threshold, the goal is to select a threshold value that produces a large enough collection of realizations to reasonably estimate posterior ranges but small enough that the posterior ranges are different from the prior ranges for sensitive parameters. A

Φ

threshold is also used in NSMC to identify equally plausible collections of parameters, or models, as part of null space delineation.
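The threshold filter can be sketched as follows, assuming the assimilation record has been pooled into one array of Φ values and one array of parameter values across all iterations. The function name and the numbers are hypothetical.

```python
import numpy as np

def select_posterior(phi, params, threshold):
    """Keep realizations with phi below the threshold and return them
    with the per-parameter minimum and maximum of the kept set.

    phi: shape (n_realizations,), pooled across all iterations.
    params: shape (n_realizations, n_params)."""
    phi = np.asarray(phi, dtype=float)
    params = np.asarray(params, dtype=float)
    keep = phi < threshold
    if not keep.any():
        raise ValueError("no realization has phi below the threshold")
    selected = params[keep]
    return selected, selected.min(axis=0), selected.max(axis=0)

# Hypothetical record: five realizations, two parameters.
phi = [120.0, 80.0, 95.0, 300.0, 60.0]
params = [[1.0, 10.0], [2.0, 20.0], [3.0, 15.0], [9.0, 90.0], [2.5, 12.0]]
selected, post_lo, post_hi = select_posterior(phi, params, threshold=100.0)
```

In practice, the threshold would be tuned until the kept collection is large enough to describe the posterior ranges yet small enough to differ from the prior ranges for sensitive parameters.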

The three important relative advantages of PESTPP-IES, from Section 2.3.1, are (1) the ability to work with non-linearity in the parameter–output relation, (2) computational efficiency for large numbers of parameters, and (3) incorporation of observation error models. The empirically approximated Jacobian matrix, combined with the relatively small number of realizations used for computational efficiency, means that the range of parameter sensitivities is not fully or robustly sampled during assimilation using ensemble methods. To conceptually address this shortcoming for null space sensitivity analysis, the spread of the prior for estimation of parameter sensitivity should be as large as possible under the constraints of professional judgment and site-specific knowledge, and the spread of the selected posterior should be limited to some degree to reduce the impacts of outliers.

One way to constrain the spread of the posterior is to utilize a central description of the posterior range in place of the full range. Observation error models function to increase the variability of target observations used for history matching. The increase in target variability to account for model and measurement error necessarily leads to an expectation of an increased spread in the posterior and an increased likelihood of outliers in the posterior, and it reinforces the conceptual need to employ a measure of the central region to provide constraint on the spread of the posterior for parameter sensitivity evaluation.

Our conceptual framework for implementing a null space sensitivity analysis based on an assimilation employing ensemble methods is enumerated below. This framework must be empirically implemented and customized for specific study sites and forward models.

Parameter sensitivities are calculated by using a ratio of the description of the spread of the prior to the description of the spread of the posterior;
- Because of sampling efficiencies and the resulting limited number of realizations, the prior description should tend towards describing the largest supported spread, and the posterior description should be constrained to depict a central region;
The null space is estimated by determining which sensitivity metric values correspond to sensitive and insensitive parameters;
- This is expected to be a trial-and-error process that is site-specific, dependent on data sufficiency, and influenced by observation error models.
- An approach that we have found useful is to employ sensitivity metric quartiles as the first step in the search for the sensitive versus insensitive demarcation and then to refine to decile metric values if needed.
- Each demarcation metric percentile that is evaluated will need to be used in forward model simulation with sensitive parameters fixed to final model values and insensitive parameters set to a prior boundary value (e.g., $B n d_{L}$ or $B n d_{U}$ ) to determine whether this is an effective demarcation level. The effective demarcation level is the largest percentile sensitivity metric value that maintains model validation.
An ensemble of sensitivity analysis models is created, leveraging the effective demarcation sensitivity metric value to fix sensitive parameters to final model values for all ensemble members and to vary insensitive parameters within the prior ranges across ensemble members;
- It is confirmed that model validation is retained for all sensitivity analysis ensemble members;
Sensitivity analysis model simulation results are evaluated for variations to or changes in model predictions;
- Changes in model predictions then identify Type IV sensitivity.

We develop and apply an empirical null space sensitivity analysis in Section 3.2 that follows the conceptual framework outlined above and complements the ensemble method framework through empirical analysis of the relative ranges of the prior and posterior parameter distributions. This sensitivity analysis is empirical and relies on postmortem investigation of the uncertainty analysis provided by DA rather than on a large number of realizations.

3. Results

The pertinent results are (1) the selection of posteriors and validation of assimilation and (2) the development and implementation of the novel empirical null space sensitivity analysis for ensemble methods. Validation is the implementation of history matching for an independent data set that was not previously part of constraining prior parameter distributions in order to generate posterior parameter distributions. Validation is presented in Section 3.1, and additional information is provided in Appendix A.2.4. Development and implementation of the novel empirical null space sensitivity analysis occurs post validation and is presented in Section 3.2.

In terms of the specifics of the assimilation of the integrated hydrological model relevant to the results presented, PESTPP-IES was configured for use in three iterations in addition to the prior-ensemble evaluation which is used to find prior-data conflicts. There were 300 realizations in each iteration, and each realization is a seven-year integrated hydrological model simulation. This configuration provides about 1200 total realizations where each realization employs a unique ensemble of parameters and is a unique model. There were 6086 observations of historic system state that were assimilated. Observation error models were used for all observations as discussed in Section 2.3.2. A total of 472 prior-data conflicts were removed, which means that

Φ

optimization occurred using 5614 historic state observations.

Computational efficiency is a relative advantage for ensemble methods of parameter estimation and is usually evaluated in terms of cost. The cost of a compute-month of the capabilities necessary to perform a simulation in about six hours was USD 330 during this study. Execution of the assimilation required about 20 compute-months, which were consumed across an interval of 10 consecutive days for a total cost of computation of approximately USD 6600.

During project scoping, we estimated the total compute cost for PEST DA, final model selection, and NSMC to be USD 30,000. Our intent was to utilize regularization techniques as needed to reduce the number of parameters varied in the assimilation to the number of parameters that would allow us to meet the preliminary budget requirements. The first step in any PEST DA is the initial estimate of the full-rank Jacobian matrix, which requires one model simulation for each parameter that is varied. The PESTPP-IES DA that cost about USD 6600 varied 5339 parameters; consequently, the creation of the full-rank Jacobian matrix would have required about 44 compute-months at a cost of approximately USD 14,000. A rough estimate of the total computation time that would be required for PEST DA, final model selection, and NSMC is twenty times that needed for the initial estimate of the full-rank Jacobian matrix, which means that the equivalent NSMC analysis would have cost around USD 280,000. If we had used PEST and NSMC, we would have used additional regularization to limit total computation costs to USD 30,000, and the cost of increased regularization is increased model bias. Finally, the cost for a scientist-month was approximately USD 25,000 during this project. We estimate that the same number of scientist-months would have been consumed for this study if PEST and NSMC had been used in preference to PESTPP-IES.
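The cost comparison can be reproduced approximately with the arithmetic below. The length of a compute-month (365.25 days / 12) is an assumption made here; the other inputs are the figures quoted above. The unrounded results differ slightly from the rounded values in the text.

```python
HOURS_PER_RUN = 6.0                    # about six hours per forward simulation
COST_PER_COMPUTE_MONTH_USD = 330.0
HOURS_PER_COMPUTE_MONTH = 365.25 * 24 / 12   # assumed month length (730.5 h)
N_PARAMETERS = 5339

# Ensemble assimilation: about 20 compute-months.
ies_cost_usd = 20 * COST_PER_COMPUTE_MONTH_USD

# Full-rank Jacobian: one forward simulation per adjustable parameter.
jacobian_months = N_PARAMETERS * HOURS_PER_RUN / HOURS_PER_COMPUTE_MONTH
jacobian_cost_usd = jacobian_months * COST_PER_COMPUTE_MONTH_USD

# Rough rule quoted in the text: the full PEST DA, final model selection,
# and NSMC workflow needs about twenty times the initial Jacobian estimate.
nsmc_cost_unrounded_usd = 20 * jacobian_cost_usd
nsmc_cost_rounded_usd = 20 * 14_000    # using the rounded Jacobian cost from the text

print(f"PESTPP-IES: USD {ies_cost_usd:,.0f}")
print(f"Jacobian: {jacobian_months:.1f} compute-months, USD {jacobian_cost_usd:,.0f}")
print(f"NSMC workflow: USD {nsmc_cost_rounded_usd:,.0f} (rounded inputs)")
```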

3.1. Validation of Assimilation

Metrics for validating goodness-of-fit are Nash–Sutcliffe efficiency (NSE) [48,49] for discharge and normalized root mean square error (NRMSE) for water levels in wells. NSE is calculated using Equation (A3). NRMSE is the normalized RMSE and is determined as discussed in Appendix A in conjunction with Equation (A2).
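For orientation, both metrics can be sketched as below. NSE follows its standard definition. The NRMSE normalization shown (RMSE divided by the observed range) is an assumption made for illustration; Appendix A and Equation (A2) define the normalization actually used in this study.

```python
import numpy as np

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the residual sum of
    squares to the sum of squared deviations of observations from their mean."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def nrmse_percent(observed, simulated):
    """RMSE as a percentage of the observed range (assumed normalization)."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    rmse = np.sqrt(np.mean((obs - sim) ** 2))
    return 100.0 * rmse / (obs.max() - obs.min())
```

A perfect match gives an NSE of 1, and a simulation no better than the observed mean gives an NSE of 0.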

Ensemble methods produce an ensemble of best-fit models. To determine the collection of best-fit models, a

Φ

threshold value was selected that filtered the approximately 1200 realizations down to a collection of 101 realizations that empirically define the posterior parameter distributions for this study. A target size of 101 was selected for the collection of best-fit models to provide a collection that was large enough to describe posteriors but small enough that posteriors could have a small spread relative to priors for sensitive parameters.

For this study, the project scope required selection of a “final” model from the ensemble of best-fit models. The final model was extracted from the 101 lowest

Φ

realizations based on the validation metric evaluation shown in Table 1. The final model provides the largest total discharge target NSE (“Total All Gauging Stations”) and the largest total important spring discharge target NSE (“Total Major Springs”).

Table 1 displays the comparison of validation metrics from the final model to the acceptable range of predefined metric values. The traditional range of NRMSE for validation of the matching between groundwater model simulated head elevations and observed water level elevations in wells is <10% [50], and the NRMSE metric value in Table 1 is 5.8%. The validation ranges for NSE values for stream gauging stations come from Ref. [19], Table 6: “Monthly ‘goodness-of-fit metrics’ between gauge discharge and synthetic biased discharge time series”. The most important NSE metric validations in Table 1 are “Total Major Springs”, “Total Major Focus Springs”, and “Total All Gauging Stations”, and the metric value is greater than the minimum validation threshold for these three aggregated metrics.

3.2. Empirical Null Space Sensitivity Analysis for Ensemble Methods

The empirical identification of parameter sensitivity compares the range of parameter values in the prior distribution to the range of parameter values in the posterior distribution. Here, the selected central description of the posterior parameter distribution was the interquartile range (IQR) from the 101 lowest

Φ

realizations,

I Q R_{101}

. Empirical parameter sensitivity,

S_{e}

, is defined with Equation (2), where

B n d_{U}

is the maximum value in the prior parameter distribution as input to PESTPP-IES for coupled assimilation, and

B n d_{L}

is the minimum value in the prior parameter distribution.

For this project, two different prior parameter distributions are available. The standalone assimilations, discussed briefly in Section 2.3.1, employ the traditional prior parameter distribution representation of the plausible range of parameter values derived from existing system knowledge and expert judgment. The purpose of standalone assimilations was to move the independent models to a reasonable first guess of the representation for dynamic coupling, and the standalone assimilations generated posterior parameter distributions that were used as prior parameter distributions for integrated hydrological model assimilation.

B n d_{U}

is the maximum value in the prior parameter distribution, and

B n d_{L}

is the minimum value in the prior parameter distribution, used in integrated hydrological model assimilation.

S_{e} = \frac{B n d_{U} - B n d_{L}}{I Q R_{101}}

(2)
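Equation (2) can be evaluated for all parameters at once, as in the following sketch. It assumes NumPy and a posterior sample array with one row per selected realization; the function name and the synthetic samples are hypothetical.

```python
import numpy as np

def empirical_sensitivity(bnd_l, bnd_u, posterior_samples):
    """Equation (2): prior range divided by the posterior interquartile range.

    bnd_l, bnd_u: prior lower and upper bounds, shape (n_params,).
    posterior_samples: shape (n_realizations, n_params).
    A zero IQR would give an infinite value and needs separate handling."""
    samples = np.asarray(posterior_samples, dtype=float)
    q25, q75 = np.percentile(samples, [25, 75], axis=0)
    iqr = q75 - q25
    return (np.asarray(bnd_u, dtype=float) - np.asarray(bnd_l, dtype=float)) / iqr

# Two synthetic parameters with prior bounds [0, 1] and 101 posterior samples each:
# the first is tightly constrained, the second still spans the whole prior.
samples = np.column_stack([np.linspace(0.4, 0.6, 101),
                           np.linspace(0.0, 1.0, 101)])
s_e = empirical_sensitivity([0.0, 0.0], [1.0, 1.0], samples)  # approx. [10, 2]
```

The constrained parameter has the larger metric value, consistent with a smaller posterior spread indicating greater sensitivity.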

IQR provides an empirical description of the central region of the posterior. Observation error models function to incorporate extra variability by increasing the spread or variability of target observation values seen by the assimilation. When the

σ

value that describes the observation error model to PESTPP-IES is relatively large, reflecting a correspondingly large amount of uncertainty, the added solution variability will likely produce more parameter insensitivity and a larger null space. Figure 7 shows the adjustments to the Hays Trinity Groundwater Conservation District (HTGCD) well ID 729 observations that are sampled from the observation error model (with a Gaussian variate defined with

μ

= 0 and

σ

= 9.14 m) for this well. The sampled adjustments can be greater than 15% of the observed elevation and introduce significant uncertainty into the assimilation results, and they can generate outliers in posterior parameter distributions.

S_{e}

identifies the certainty related to the posterior parameter distribution from the posterior ensembles. The smaller the

I Q R

, the greater the certainty and the smaller the variability for the posterior parameter distribution. A relatively small

I Q R_{101}

and thus a relatively large

S_{e}

suggest significant changes to

Φ

during assimilation, which means that a parameter does not have Type I or IV sensitivity.

To implement the empirical null space sensitivity analysis,

S_{e}

was calculated for all adjustable parameters using the 101 best-fit realizations. The 75th percentile

S_{e}

was 5.966, which denotes an

I Q R_{101}

value with approximately 16.8% of the spread of the prior parameter distribution range. It is the threshold selected to delineate sensitive (those with

S_{e} \geq

5.966) from insensitive parameters (those with

S_{e} <

5.966). A trial-and-error approach was employed to find this effective value of the sensitivity metric for demarcation, as outlined in Section 2.4. Figure 8 and Figure 9 display parameter samples from standalone and coupled assimilations for sensitive and insensitive storage parameters, respectively.
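A percentile-based demarcation of this kind can be sketched as follows. The function name and the example metric values are hypothetical; only the 75th percentile choice and the value 5.966 come from the text.

```python
import numpy as np

def demarcate(s_e, percentile=75.0):
    """Split parameters at a percentile of the sensitivity metric.

    Returns the threshold and a boolean mask that is True for sensitive
    parameters (metric at or above the threshold)."""
    s_e = np.asarray(s_e, dtype=float)
    threshold = np.percentile(s_e, percentile)
    return threshold, s_e >= threshold

# Hypothetical metric values for nine parameters.
threshold, sensitive = demarcate([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])

# Worked check of the value quoted in the text: at the threshold, the
# posterior IQR is 1 / 5.966 of the prior range, or about 16.8%.
iqr_fraction_of_prior = 1.0 / 5.966
```

Because the split is made at the 75th percentile, roughly one quarter of the adjustable parameters are treated as sensitive and the remainder as candidate null space parameters, subject to the validation check described in Section 2.4.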

Five sensitivity realizations, or models, were created using

S_{e}

values to examine the impact of Type IV parameter sensitivity on important simulated outcomes. Comparisons of validation metrics in the 101 lowest

Φ

realizations suggest that varying porous media parameters, that is, UZF and MODFLOW 6 parameters, is more likely to lead to insignificant changes to

Φ

. The NRMSE, which applies to simulated water levels in MODFLOW 6, only varied by 0.9% in relative range and 0.7% in absolute value across the 101 lowest

Φ

realizations. Consequently, all 101 lowest

Φ

realizations performed approximately equally in terms of history matching to water level elevations.

In contrast, HSPF parameters relate to the empirical relationship between volume and discharge and directly impact simulated values that are matched to discharge target observations. For the total NSE from the three important spring targets, the relative variation was 23% and the absolute range was 0.3 across the 101 realizations. Similarly, the total NSE across all discharge target locations had a relative variation of 4% and an absolute range of 0.4. The NSE for discharge at gauging stations is sensitive, i.e., it varies appreciably depending on the realization selected from the posterior parameter distributions. This indicates that the HSPF parameters are not in the null space and therefore should not be varied as part of the null space sensitivity analysis.

To create the five sensitivity realizations, insensitive groundwater flow, UZF storage, and hydraulic conductivity parameters were varied across the ranges listed below because of the small amount of change to the NRMSE validation metric across the 101 realizations. The range of variation for

S_{y}

was determined from professional judgment and the results of standalone assimilation. Consequently, the null space sensitivity analysis range for

S_{y}

is somewhat smaller than the prior range for standalone assimilation but is generally larger than the coupled assimilation prior range. These five realizations were constructed such that they sample a wide but plausible range of Type IV parameters in an effort to explore the full posterior range of important simulated outcomes and to bound the extreme but plausible predictive results. The five realizations include the following.

S1, sensitivity analysis ensemble 1: $S_{y}$ set to 0.15. Hydraulic conductivity, K, values, x-, y-, and z-directions, set to $B n d_{U}$ .
S2, sensitivity analysis ensemble 2: $S_{y}$ set to 0.20. K values, x-, y-, and z-directions, set to $0.75 \times (B n d_{U} - B n d_{L}) + B n d_{L}$ .
S3, sensitivity analysis ensemble 3: $S_{y}$ set to 0.25. K values, x-, y-, and z-directions, set to $0.50 \times (B n d_{U} - B n d_{L}) + B n d_{L}$ .
S4, sensitivity analysis ensemble 4: $S_{y}$ set to 0.30. K values, x-, y-, and z-directions, set to $0.25 \times (B n d_{U} - B n d_{L}) + B n d_{L}$ .
S5, sensitivity analysis ensemble 5: $S_{y}$ set to 0.35. K values, x-, y-, and z-directions, set to $B n d_{L}$ .
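The rules above can be written compactly as a linear interpolation between the prior bounds. The sketch below applies only to parameters already identified as insensitive; the function name, the dictionary layout, and the example bounds are hypothetical.

```python
import numpy as np

SY_VALUES = [0.15, 0.20, 0.25, 0.30, 0.35]       # S1 through S5
K_FRACTIONS = [1.00, 0.75, 0.50, 0.25, 0.00]     # position between Bnd_L and Bnd_U

def sensitivity_realizations(bnd_l, bnd_u):
    """Build the S1-S5 settings for insensitive parameters.

    bnd_l, bnd_u: prior lower and upper bounds for the insensitive hydraulic
    conductivity parameters, shape (n_params,)."""
    bnd_l = np.asarray(bnd_l, dtype=float)
    bnd_u = np.asarray(bnd_u, dtype=float)
    realizations = {}
    for i, (sy, frac) in enumerate(zip(SY_VALUES, K_FRACTIONS), start=1):
        # frac = 1 gives Bnd_U, frac = 0 gives Bnd_L.
        realizations[f"S{i}"] = {"Sy": sy, "K": frac * (bnd_u - bnd_l) + bnd_l}
    return realizations

# Hypothetical prior bounds for two insensitive K parameters.
reals = sensitivity_realizations([1.0, 0.1], [5.0, 0.5])
```

Note that the largest conductivity is paired with the smallest specific yield in S1 and the reverse in S5, as in the list above.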

In sensitivity analysis realizations, sensitive ($S_{e} \geq 5.966$) groundwater flow and UZF parameters were given the parameter values from the final model. Insensitive ($S_{e} < 5.966$) groundwater flow and UZF parameters were assigned values based on the set of rules listed above. HSPF parameters were fixed to final model values because simulated stream discharge was sensitive to variations in HSPF parameters as noted above. MODFLOW 6 drain stress package (DRN) conductance parameters, adjusted as part of coupled assimilation and listed in Table A6, were assigned final model values for all sensitivity analysis ensembles because DRNs are directly connected to HSPF as part of dynamic coupling. Furthermore, groundwater flow and UZF parameters in the BFZ Fault Block Water Budget Region #2, see Figure 3 and Table A4, were assigned final model values. Use of final values for these parameters was necessary to maintain the best-fit discharge and prediction skill, as simulated for San Marcos Springs (8170950) by the final model.
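
The assignment logic in this paragraph amounts to a short decision rule per parameter. The sketch below is illustrative only: the dictionary keys (`model`, `region`, `se`, `final`) and the function name are hypothetical and do not correspond to the study's files or to any PEST++ data structure.

```python
SE_THRESHOLD = 5.966  # empirical sensitivity cutoff quoted in the text


def assign_value(param, rule_value):
    """Pick the value used in a sensitivity realization for one parameter.

    param is a dict with hypothetical keys: 'model' ('HSPF', 'DRN', 'GWF',
    or 'UZF'), 'region' (water budget region Id), 'se' (empirical
    sensitivity), and 'final' (final model value). rule_value is the value
    prescribed by the S1 to S5 rule for this realization.
    """
    # HSPF and DRN conductance parameters always keep final model values.
    if param["model"] in ("HSPF", "DRN"):
        return param["final"]
    # Region 2 (BFZ fault blocks) keeps final values to preserve the
    # San Marcos Springs discharge fit.
    if param["region"] == 2:
        return param["final"]
    # Sensitive groundwater flow and UZF parameters keep final values.
    if param["se"] >= SE_THRESHOLD:
        return param["final"]
    # Insensitive parameters take the value from the realization rule.
    return rule_value
```

Ordering the checks this way makes the exceptions explicit: only groundwater flow and UZF parameters outside Region #2 with $S_{e}$ below the threshold are ever varied.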

Table 2 displays goodness-of-fit validation metrics from the final model and the five sensitivity analysis ensembles. There is little variation in these metrics across the six models, thus identifying that all six are equally validated and that the groundwater flow and UZF parameters that varied for the sensitivity analysis are insensitive and null space parameters. This means that these parameters have Type I or IV sensitivity.

Table 3 provides a summary of selected simulated storage values from the final model and the five sensitivity analysis realizations. The variation in storage among these equally validated models indicates that different model conclusions are provided. As a result, the groundwater flow and UZF parameters, which are identified as insensitive, have Type IV sensitivity because changing these parameters does not significantly impact either $\Phi$ or model validation but does impact model predictions, as shown in Table 3.

The residence time multiplier value in Table 3 was derived from Equation (3) for residence time, $R_{t}$, as the ratio of the volume for the target model to that of sensitivity analysis model S1 under the assumption that total outflow discharge is approximately equivalent across the final model and the five sensitivity analysis models. All aquifer outflow discharges affect stream discharge validation at the gauging stations, and the similar NSE validation metric values in Table 2 suggest that total outflow discharge is approximately constant across these six models. Equation (3) defines $R_{t,i}$, where $V_{s,i}$ is the volume of water in storage in the fault-bounded aquifer segment, $Q_{out}$ is the total outflow discharge and includes spring discharge and pumping, and $i$ represents the particular model with $i = 0$ being the final model, $i = 1$ representing S1, and so forth.

$$R_{t,i} = \frac{V_{s,i}}{Q_{out}} \qquad (3)$$

The residence time multiplier, $Rm_{i}$, was calculated using Equation (4), which normalizes the storage volume for model $i$ using the storage volume for model S1. S1 was selected for normalization because it will always have the smallest storage volume because it uses the smallest $S_{y}$ value for insensitive parameters. $Rm_{i}$ denotes the relative amount of time taken to drain the aquifer when there are no inflows and under approximately current conditions of spring discharge and pumpage. $Rm_{1}$ is always one in Table 3 because S1 provides normalization, and $Rm_{i}$ values vary from one to two, with two denoting that it takes twice as long to completely drain the aquifer for the S5 model as for the S1 model.

$$Rm_{i} = \frac{V_{s,i} / Q_{out}}{V_{s,1} / Q_{out}} = \frac{V_{s,i}}{V_{s,1}} \qquad (4)$$
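
Equations (3) and (4) are simple enough to check numerically. The sketch below is a minimal illustration with hypothetical function names and placeholder storage volumes (not the values in Table 3); it assumes, as the text does, that $Q_{out}$ is the same for every model so that it cancels in the ratio.

```python
def residence_time(v_s, q_out):
    """Equation (3): storage volume divided by total outflow discharge."""
    if q_out <= 0:
        raise ValueError("q_out must be positive")
    return v_s / q_out


def residence_multipliers(volumes):
    """Equation (4): storage volumes normalized by the S1 volume.

    volumes is ordered by model index i = 0 (final model), 1 (S1), ...
    """
    v_s1 = volumes[1]
    return [v / v_s1 for v in volumes]


# Placeholder volumes (illustrative only, not Table 3 values).
volumes = [1.5e9, 1.0e9, 1.25e9, 1.5e9, 1.75e9, 2.0e9]
rm = residence_multipliers(volumes)
```

Because $Q_{out}$ cancels, the multiplier depends only on storage volumes; if outflow differed appreciably among models, the ratio of residence times would have to be computed from Equation (3) directly.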

Two primary threats to future water resource availability examined in the BRAAT project are increased drought frequency and magnitude and increased water demands resulting from economic development. $R_{t}$ and $Rm_{i}$ are directly applicable to decision support for responding to these threats because severe drought reduces the inflows to aquifers and presents an environment approximating the residence time calculation for the amount of water mining time available during prolonged drought conditions. Increased water demand produces increased $Q_{out}$, as the amount of pumpage increases relative to current conditions. Increased $Q_{out}$ causes reduced $R_{t}$. The $Rm_{i}$ values in Table 3 suggest that the BRAAT-integrated hydrological model places a limited constraint on the absolute amount of water that is available to be mined from aquifer storage, and it provides limited decision support related to total available groundwater storage.

4. Discussion

DA explicitly describes the predictive uncertainty generated by (1) limitations on scientific understanding, (2) data insufficiency, and (3) observation and model representation errors. Consequently, it should be used to estimate posterior parameter values when hydrological models are used to assimilate noisy and flawed observations. PEST++ and PEST provide DA and were created specifically to address and characterize inherent parameter uncertainty. PESTPP-IES is the PEST++ tool set member that provides an explicit observation error model for assimilation. NSMC analysis is a PEST tool set for examining data sufficiency.

The purpose of the empirical null space sensitivity analysis presented in Section 3.2 is to examine data sufficiency for ensemble methods, i.e., PESTPP-IES. Insufficient water level elevation data are available to constrain the absolute volume of water stored in the four aquifers of interest across the seven water budget regions, as shown in Figure 3, resulting in limited decision support for integrated water resources management in terms of absolute amounts and residence times. The appropriate correction method for data insufficiency is to collect more and better measurements to constrain feasible groundwater flow and storage property values for integrated hydrological modeling.

4.1. Model Complexity and Bias–Variance Tradeoff

The empirical null space sensitivity analysis suggests that variations in parameters, which are part of the empirical relationships between volume and discharge in HSPF, generate significant changes to both $\Phi$ and model predictions. Does this mean that HSPF provides a “better” model or representation relative to MODFLOW 6? No, this only means that HSPF is a less complex model that provides less solution variability and a more unique solution within this implementation.

As discussed in Section 2.3.1 and Section 2.3.3, HSPF is less complex because it is an ODE representation, and only 409 parameters are varied during assimilation. Additionally, these 409 parameters are sub-parameters within 48 empirical volume-to-discharge relationships. In contrast, MODFLOW 6 is a PDE representation in which more than 4700 parameters are varied during assimilation. This means that the HSPF representation has somewhere between 9% (409 ÷ 4746) and 1% (48 ÷ 4746) of the degrees of freedom of the MODFLOW 6 representation.
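
The two percentages follow directly from the parameter counts quoted in the text. The short check below is only a worked version of that arithmetic; the variable names are illustrative.

```python
HSPF_PARAMETERS = 409        # individual FTABLE rows varied in assimilation
HSPF_FTABLES = 48            # volume-to-discharge lookup tables
MODFLOW6_PARAMETERS = 4746   # MODFLOW 6 parameters varied in assimilation

# Upper estimate: every varied FTABLE row counts as a degree of freedom.
upper_pct = 100.0 * HSPF_PARAMETERS / MODFLOW6_PARAMETERS
# Lower estimate: each lookup table counts as one degree of freedom.
lower_pct = 100.0 * HSPF_FTABLES / MODFLOW6_PARAMETERS
```

Rounded to whole percentages, these give the 9% and 1% bounds stated above.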

Figure 5 and Section 2.3.3 identify that the cost of a more unique solution and less solution variability is an increase in model bias. This means that the HSPF representation is likely more “wrong” than the MODFLOW 6 representation. It is, however, more consistently wrong because there are fewer parameters to vary and thus the representation can really only represent a small subset of feasible conditions across the study site.

The MODFLOW 6 representation has too many degrees of freedom (relative to the available observations) and thus produces too much solution variability. The amount of solution variability is directly increased in correspondence with the magnitudes of the observation error models used in assimilation. The downside of too much solution variability is limited decision support and an inability to differentiate among options or scenarios because any answer is feasible. The empirical null space sensitivity analysis demonstrates that a wide variety of answers are feasible because of data insufficiency. From the bias–variance tradeoff discussion, expectations of lower model bias accompany increased solution variability. Unfortunately, there are not sufficient observations to constrain the posterior parameter fields to the lower biased regions of the solution space.

4.2. Addressing Uncertainty from Imperfect Models and Insufficient Data

Observation error models are employed in this assimilation because hydrological models are imperfect and represent approximate solutions and because significant observation error is expected for target observations due to measurement error and reliance on approximate calculations. Ensemble methods provide an inherent approach for describing uncertainty because an ensemble of best-fit models produces an ensemble of predicted future system states. When the prediction ensemble is cast to a probabilistic time history, the ensemble of predicted futures provides a likelihood for each future system state.

However, empirical null space sensitivity analysis identifies that the spread of any probabilistic time history obtained from the ensemble of sensitivity models and the final model will be too broad and too variable for effective decision support. Hydrological and hydrodynamical models provide process- and physics-based, respectively, depictions of the movement and storage of water. The purpose of these representations within DA is to allow processes and physics encoded in the models to fill in the gaps among observations. Another advantage of hydrological and hydrodynamical models is that it can be expected that changes in input forcing will propagate to process- and physics-based changes in model predictions. The result is that hydrological and hydrodynamical models will produce physically consistent results. Hydrological models will be consistently “wrong” in absolute terms due to inherent physical representation limitations and will be consistently “right” in relative terms due to the underlying process- and physics-based representation.

Production of physically consistent results means that a deterministic process- or physics-based model can be used in risk and sustainability analyses for relative decision support even when reliable, absolute predictions of volume or residence time are unavailable. For useful relative analyses, simply use the same deterministic model with base case forcing and with scenario forcing, which represents the future change of concern. For the BRAAT project, the changes of concern would be future weather patterns and future extractions from pumping. The simulation result differences between future scenarios and the base case can then be compared. A deterministic process- or physics-based model will produce consistent bias, which will be conceptually zeroed out through differencing of results [51,52].
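
The cancellation of consistent bias can be illustrated with a toy calculation. This sketch assumes the simplest possible case, a constant additive bias shared by the base case and scenario runs; real model bias may depend on system state, in which case the cancellation is only approximate. All names and numbers are illustrative.

```python
def scenario_difference(base_case, scenario):
    """Element-wise difference between scenario and base case results."""
    if len(base_case) != len(scenario):
        raise ValueError("series must have equal length")
    return [s - b for s, b in zip(scenario, base_case)]


# Hypothetical "true" system states and a constant model bias.
true_base = [10.0, 12.0, 11.0]
true_scenario = [8.0, 9.0, 7.0]
bias = 2.5

# The same deterministic model is assumed to carry the same bias in both runs.
sim_base = [v + bias for v in true_base]
sim_scenario = [v + bias for v in true_scenario]

diff_true = scenario_difference(true_base, true_scenario)
diff_sim = scenario_difference(sim_base, sim_scenario)
```

Under this assumption `diff_sim` matches `diff_true` even though every simulated value is wrong in absolute terms.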

4.3. Future Work

Ideally, future work will involve collecting observations from new facilities that include dedicated monitoring wells and piezometers. However, the construction of wells and dedicated monitoring facilities is expensive and time-consuming. It will likely take decades to address the data insufficiency issues identified from the null space sensitivity analysis. Independent of measurement and observation collection, future work will involve risk, impacts, and sustainability analyses using integrated hydrological modeling and DA to explicitly depict and account for imperfect models and limited data using relative scenario-based analyses.

The main modeling-related evolution that will occur is the movement to less complex, or simpler, model representations. Section 4.2 identifies that the available observations cannot support a highly parameterized PDE-based approach. The idea of moving to simpler models, derived to reproduce the sensitivity analysis ensemble members to predefined accuracy and precision thresholds, is to leverage the computational efficiencies provided by simpler models. Sensitivity analysis models require about six hours for a seven-year simulation. It should be feasible to create reduced-dimensionality reproductions that require mere minutes for execution of climate risk and sustainability simulations that cover more than 30 years.

An ensemble of simpler models that predict differences between future scenarios and a base case can and will provide valuable decision support related to future climate change and increased economic development and water consumption. Feasible reduced-dimensionality modeling approaches that can be considered include kernel methods [53], data space inversion (DSI) [54], and system dynamics representation. System dynamics modeling is the currently preferred approach because it is a reduced-dimensionality approach that still explicitly provides a process- and physics-based representation. It seeks custom understanding and representation of the non-linear dynamic behavior of complex systems using stocks (stocks are reservoirs or storages), flows, internal feedback loops, table functions, and time delays (convolution is a standard mathematical representation of time delay) [55].
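
To make the stock-and-flow vocabulary concrete, the following is a minimal sketch of a single stock with one feedback loop. It is not a proposed BRAAT model: the linear outflow rule, the explicit Euler time stepping, and all names and numbers are assumptions chosen only for illustration.

```python
def simulate_stock(initial_storage, inflows, outflow_coeff, dt=1.0):
    """Explicit Euler integration of dS/dt = inflow - outflow_coeff * S.

    The stock S is storage; the outflow depends on the current stock,
    which is the simplest form of internal feedback loop.
    """
    storage = [initial_storage]
    for inflow in inflows:
        s = storage[-1]
        outflow = outflow_coeff * s
        # Storage cannot become negative.
        storage.append(max(s + dt * (inflow - outflow), 0.0))
    return storage


# Drought-like case: no inflow for three time steps.
drought = simulate_stock(100.0, inflows=[0.0] * 3, outflow_coeff=0.5)
```

A model of this kind runs in a fraction of a second, which is the computational advantage the text anticipates from reduced-dimensionality representations.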

5. Conclusions

A null space sensitivity analysis for ensemble methods that utilize observation error models is developed and implemented to provide one component of predictive uncertainty analyses conducted for integrated hydrological modeling within the BRAAT project. Empirical parameter sensitivity is determined from the comparison of the range of parameter values in the prior distribution to the range of parameter values in the posterior distribution. An ensemble of sensitivity analysis models is then created by varying insensitive parameters while fixing sensitive parameters to best-fit final model values. This sensitivity analysis ensemble identifies Type IV sensitivity because the amount of water in storage in the Middle Trinity Aquifer may as much as double across the sensitivity analysis ensemble members. Null space sensitivity analysis describes uncertainty from data insufficiency.

Ensemble methods employing observation error models provide robust predictive uncertainty analysis because observation error models explicitly describe uncertainty from observation errors and from model representation errors. It is important to include considerations of model representation error when using hydrological models because these algorithms provide approximations rather than accurate predictions due to the prevalence of empirical relationships and incomplete representation of the physical forces driving water movement. When null space sensitivity analysis is included, ensemble methods with observation error models provide robust predictive uncertainty analyses that include uncertainty from data insufficiency, imperfect forward models, and noisy and biased observations.

Author Contributions

Conceptualization, N.M., J.W. and P.S.; methodology, N.M. and J.W.; software, N.M., J.W. and P.S.; writing—original draft preparation, N.M.; writing—review and editing, N.M., J.W. and P.S.; validation, N.M.; funding acquisition, N.M. All authors have read and agreed to the published version of the manuscript.

Funding

pyHS2MF6 development, testing, and verification were funded by the SwRI Internal Research and Development Grant 15-R6015. Data set acquisition and processing and integrated hydrological model configuration were funded under the Texas State University Subaward 21040-83739-1. Data assimilation and validation were funded under the Texas State University Subaward 22026-83949-1.

Data Availability Statement

The information and source code generated for this study can be found via the links in Table A1. Data sets analyzed for this study can be found via the links in Table A2 and can be requested via the information provided in Table A3.

Acknowledgments

The authors acknowledge the contributions of two anonymous reviewers and the guidance of the academic editors whose comments and suggestions improved the quality of this paper.

Conflicts of Interest

Author N.M. was employed by Vodanube, LLC, and author J.W. was employed by INTERA Incorporated. The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

ADCP	Acoustic Doppler current profiler
BFZ	Balcones Fault Zone
BRAAT	Blanco River Aquifers Assessment Tool
CFD	Computational fluid dynamics
DA	Data assimilation
DNS	Direct numerical simulation
EnKF	Ensemble Kalman filter
FTABLE	Volume to discharge empirical lookup table in HSPF
HRU	Hydrologic response unit
HSPF	Hydrological Simulation Program–FORTRAN
iES	Iterative ensemble smoother
$K_{x x}$	Hydraulic conductivity in the x-direction
$K_{y y}$	Hydraulic conductivity in the y-direction
$K_{z z}$	Hydraulic conductivity in the z-direction
MA	Monthly-averaged water level elevation
MCMC	Markov Chain Monte Carlo
NRMSE	Normalized root mean square error
NSE	Nash–Sutcliffe Efficiency, Equation (A3)
NSMC	Null Space Monte Carlo
ODE	Ordinary differential equation
PDE	Partial differential equation
PERLND	Pervious land segment in HSPF
PP	Pilot point
PT	Point-in-time water level elevation
pyHS2MF6	Integrated hydrological model
RANS	Reynolds-averaged Navier–Stokes equations
RCHRES	Well-mixed reservoir structure in HSPF
RMSE	Root mean square error, Equation (A2)
$S_{s}$	Specific storage
$S_{y}$	Specific yield
TIFP	Texas Instream Flow Program
TRIM	Tidal, Residual, Intertidal Mudflat Model
TX	Texas
UZF	Unsaturated Zone Flow advanced stress package in MODFLOW 6

Appendix A

This appendix provides data set availability summary tables in Appendix A.1 and summary information concerning the Blanco River Aquifers Assessment Tool (BRAAT) project in Appendix A.2.

Appendix A.1. Data Availability Tables

Table A1. Publicly available data and information from this study.

Data Source Name	Last Accessed
Conceptual Model Report	28 August 2024
pyHS2MF6 Source Code	28 August 2024
pyHS2MF6 Development Documentation	28 August 2024
Calibration Files and Scripts	28 August 2024
Calibrated and Validated Model Files	28 August 2024
Draft BRAAT ¹ HSPF Model: Standalone Setup and Configuration	28 August 2024
Draft BRAAT ¹: MODFLOW 6 Setup and Configuration Documentation	28 August 2024
Draft BRAAT ¹ Phase II, Components A & B	28 August 2024

¹ BRAAT is the Blanco River Aquifers Assessment Tool.

Table A2. Publicly available data and information available from online portals.

Data Source Name	Last Accessed
Water Data for Texas	2 September 2024
Web Soil Survey	29 July 2020
United States Geological Survey (USGS) Discharge Data	13 September 2022
Lower Colorado River Authority (LCRA) Discharge Data	13 September 2022
PRISM Climate Group	28 August 2024
TWDB ¹ Historical Groundwater Pumpage	28 August 2024
Texas Water Rights Viewer	28 August 2024
NPDES Monitoring Data Download	28 August 2024

¹ TWDB is Texas Water Development Board.

Table A3. Agencies with publicly available data and information available by request.

Data Source Name	Last Accessed
Edwards Aquifer Authority (EAA)	2 September 2024
Barton Springs Edwards Aquifer Conservation District (BSEACD)	2 September 2024
Blanco-Pedernales Groundwater Conservation District (BPGCD)	2 September 2024
Cow Creek Groundwater Conservation District (CCGCD)	2 September 2024
Comal Trinity Groundwater Conservation District (CTGCD)	2 September 2024
Hays Trinity Groundwater Conservation District (HTGCD)	2 September 2024
Hill Country Underground Water Conservation District (HCUWCD)	2 September 2024
Southwestern Travis County Groundwater Conservation District (SWTCGCD)	2 September 2024

Appendix A.2. Blanco River Aquifers Assessment Tool (BRAAT) Project

The Blanco River flows eastward across the Texas (TX) Hill Country for about 140 km before joining the San Marcos River near San Marcos, TX, which is between New Braunfels and Austin, TX as shown on Figure 1. The Blanco River basin includes some of the fastest growing counties in TX and in the USA. During its journey, the Blanco River interacts with four aquifers: (1) Balcones Fault Zone (BFZ) Edwards Aquifer, (2) Upper Trinity Aquifer, (3) Middle Trinity Aquifer, and (4) Lower Trinity Aquifer. The study region is karst terrain, and these four aquifers are carbonate aquifers with secondary porosity and dissolution along discontinuities to produce sinkholes, caves, conduits, swallets, and other dissolution features [56,57,58,59,60,61,62,63]. Communication between the river and aquifers is of primary interest to water resource management. This system interaction is known to be exceptionally fast, direct, and mainly occurs where the river is in contact with karstic rock units and thus secondary porosity and dissolution features, which comprise the aquifers [21,22,23,24].

The initial phase of the BRAAT Project was development of a conceptual model of water movement and storage in the Blanco River basin among rivers and streams and aquifers. This conceptual model is documented in the “Conceptual Model Report” [17], and it provides the blueprint for integrated hydrological model development and application that occurred in the second phase of the project.

Primary water movement and storage elements in the conceptualization are two related hypotheses: (1) fast communication pathways in the karst terrain provide for extreme communication between surface water and groundwater systems and (2) groundwater movement and storage in the focus area is primarily controlled by geological structure within the aquifer systems. The two water movement and storage hypotheses are interrelated because geologic structures contribute to karst development [17].

The study area spans the BFZ, which is 25–50 km (15–31 mi) wide and has a maximum vertical displacement or fault throw of about 370 m (1200 ft) [64,65]. It is a system of mostly southeast-dipping, en echelon, normal faults that provides the transition from the Edwards Plateau in the western-most portion of the study area to the Texas Coastal Plain in the southeastern-most portion of the study area [66]. Figure A1 displays topographic elevation and a selection of faults, which are part of the BFZ, that have been mapped across the study area. Structural mapping in the region suggests that it contains multiple 3- to 11-km (2- to 7-mi) wide fault blocks bounded by large displacement normal faults with individual throws ranging from 30 to 265 m (100 to 850 ft) [64]. “Major Faults” shown on Figure A1 are large displacement normal fault zones. Hundreds of “Other Faults” are also shown on Figure A1, and normal displacement varies significantly among these “Other Faults”.

Figure A1. Topography, faults, and fault block simplification. The highest topographic elevations are in the western portion of the domain. Topography decreases moving east. The study area cuts across a portion of the Balcones Fault Zone (BFZ), which is an en echelon displacement framework. Topography decreases the most moving towards the south–east, perpendicular to the “Major Fault” traces. “Other Faults” are those mapped and inferred as part of high resolution, regional hydrostratigraphy studies [60,61,62,63]. “Major Faults” are the fault zones selected during conceptual model development to segment the study area into fault block regions [17]. The primary simplification provided by fault block regions is isolation of stratigraphic offset (and uplift) in flow system conceptualization to seven focused zones.

Two primary threats to future water resources availability in the study area provided the impetus for BRAAT project initiation: (1) changing weather patterns, particularly increased drought frequency and magnitude from global climate change, and (2) increased water demands in this rapidly growing and developing portion of the San Antonio–Austin corridor. Integrated hydrological water budget analysis was selected as the modeling output and model-based analysis for considering these two threats, given the water movement and storage conceptualization of (very) fast communication between surface waters and groundwater within a structurally controlled study area. Figure 3 displays the integrated hydrological water budgeting basins and regions used in the analysis. The Conceptual Model Report and other documents detailing the BRAAT project are available per Table A1. Data sources used in the BRAAT project are listed in Table A2 and Table A3. Table A4 provides labels for water budget regions.

Table A4. Labels for Water Budget Region Id shown on Figure 3.

Id ¹	Descriptive Label	Short Label ²
1	Balcones Fault Zone (BFZ) Edwards Artesian Region	BFZ Artesian
2	BFZ Edwards Fault Blocks Region	BFZ FBlocks
3	Barton Springs Pool Region	Barton Springs
4	Transition Fault Blocks Region	Trans FBlocks
5	Barton Springs Contributing Area Region	BS Contrib
6	Horst Block and Southern Wedge Region	Horst South
7	Far West Region	West
8 ³	Bounding Fault Zones Region	Fault Zones

¹ ID # is shown on Figure 3. ² “Short Label” is shown on Figure A5. ³ The Bounding Fault Zones Region is not contiguous and is not used for a budget volume in water budgeting calculations.

Appendix A.2.1. Observations

Two types of state observations provide information for assimilation through history matching: (1) water level elevations in wells and (2) stream discharge from gauging stations. Water elevations observed in wells are available by request from the agencies listed in Table A3. At the time of data processing for this study in 2021, there were no dedicated monitoring well or piezometer observations available in the study area. All water level observations are from wells that are, or may be, pumped with some unknown frequency. Additionally, a lack of funding, and the resulting lack of data curation and quality control, contributes additional error and uncertainty.

Therefore, we divided the water level elevation data set into two distinct categories for history matching: (1) monthly-averaged (MA) and (2) point-in-time (PT) water levels. MA water levels are considered the “best” data set because observations are collected regularly at these locations, typically with daily or higher frequency. Recording many observations provides confidence that specious observations can be easily identified and filtered. This is a concern because only raw field observations are available. PT water levels are only observed one time a month or less frequently. PT water levels are expected to be less accurate and less precise because fewer measurements are made and raw field observations must be used with limited filtering and cleaning. Fewer measurements increase the likelihood of including specious observations.

Figure 2 displays the locations for 96 wells with water level observations within the study area. The Middle Trinity Aquifer has by far the largest number of wells in the focus area, with 53. The focus area is the “Integrated Focus Area” in Figure 2. The other three aquifers examined (BFZ Edwards Aquifer, Upper Trinity Aquifer, and Lower Trinity Aquifer) have three wells each within the focus area. For the Middle Trinity Aquifer, 53 wells in the focus area correspond to an average density of 1 observation location per 40.5 km².

River and stream discharge observations also provide considerable information for history matching. Sixteen stream gauging stations are in the study area, as shown on Figure 2. These stations are configured to provide the most relevant information related to fast communication pathways in the karst terrain between surface water and groundwater. Several of these stations have records that identify a sharp decrease in discharge moving downstream for medium to low flows and identify spatially localized “losing” conditions for river reaches. Stream discharge decreases in these locations because of direct communication with an aquifer; some of these stations are currently used to estimate recharge for the BFZ Edwards Aquifer [17,23,24].

Ref. [19] provides detailed analysis of these discharge observations for the BRAAT project. Limited information is presented here related to these data sets because the interested reader can access this documentation and information from Ref. [19]. For this paper, observation quality and expectations for measurement error are important considerations related to the new research presented. Water stage recorders, which measure the static water level in the stream or river, and a rating curve are used to calculate discharge at each of these stations. Expected measurement errors for discharge calculated using a rating curve are ±50–100% for low flows, ±10–20% for medium or high in-bank flows, and ±40% for out-of-bank flows [20]. Discharge calculated using a rating curve from an observation of normal depth is an approximation because a rating curve provides an empirical, and poor quality, hydrodynamics model. Additionally, only two stations (out of 16) have Water Year (WY) 2021 data quality assessed as “good”. The other 14 stations have “fair” to “poor” data quality assessments. Low flows are important in the study area, occur more frequently than typically expected because of direct and fast communication between surface waters and groundwater, and provide the largest relative measurement error expectation, ±100% [19].
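
The practical meaning of these error expectations can be seen by converting them to discharge bounds. The sketch below uses only the percentages quoted above; the regime labels, the function name, and the choice to use the largest error in each range are assumptions for illustration, and the flow thresholds that separate regimes are not specified here, so the regime is supplied by the caller.

```python
# Expected relative errors for rating-curve discharge, as quoted in the text.
# Each entry is (smallest, largest) expected fractional error for the regime.
RATING_CURVE_ERROR = {
    "low": (0.50, 1.00),
    "in_bank": (0.10, 0.20),
    "out_of_bank": (0.40, 0.40),
}


def discharge_bounds(q, regime):
    """Return (lower, upper) discharge using the largest expected error."""
    worst = RATING_CURVE_ERROR[regime][1]
    # Discharge cannot be negative, so the lower bound is floored at zero.
    return max(q * (1.0 - worst), 0.0), q * (1.0 + worst)
```

For a low flow, the lower bound is zero and the upper bound is twice the reported value, which illustrates why an explicit observation error model is needed when low-flow discharge is a history matching target.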

Appendix A.2.2. Integrated Hydrological Model Development

As part of the conceptualization phase, an integrated hydrological modeling approach was specified by project stakeholders, with integrated denoting a full hydrologic cycle representation where surface and subsurface water flow and storage are dynamically linked and explicitly depicted [17]. During planning for integrated hydrological model implementation, project stakeholders selected the Hydrological Simulation Program–FORTRAN (HSPF) for representation of water movement and storage in watersheds, rivers, and creeks, and MODFLOW 6 for representation of groundwater movement and storage. HSPF and MODFLOW 6 were chosen because they are (1) open source, (2) available for free, and (3) commonly used in TX, which facilitates model reuse. The MODFLOW 6 Unsaturated Zone Flow (UZF) advanced stress package was selected as the third model component to dynamically link surface (HSPF) and deep subsurface waters (MODFLOW 6).

HSPF and MODFLOW 6 are different conceptual representations implemented with completely different mathematics. HSPF is a collection of ordinary differential equations (ODEs) and empirical relationships that are linked into a fixed direction network representation of water movement and storage. MODFLOW 6 solves the groundwater flow system of partial differential equations (PDEs) across several mesh formulations. Because of these differences, both programs had to be modified slightly from a logistics standpoint to enable effective communication and information sharing. Science and engineering representations and calculations were maintained in both so that simulation results from logistically modified code bases agree with the existing and independent HSPF and MODFLOW 6 computer programs.

The coupling of HSPF and MODFLOW 6 was coded, implemented, and named pyHS2MF6 (https://github.com/nmartin198/pyHS2MF6, accessed on 10 February 2025) as part of Southwest Research Institute® (SwRI®) Internal Research and Development Project 15-R6015, completed in May 2020. A custom Queue Server Information Sharing Framework was developed to dynamically couple HSPF and MODFLOW 6 while maintaining science and engineering representations. Figure A2 provides a schematic showing the linkages among HSPF, UZF, and MODFLOW 6. The logistically modified version of HSPF is mHSPF, and the version of MODFLOW 6 wrapped in a custom Python interface to provide for communication with HSPF is pyMF6. Note that MODFLOW 6, and thus pyMF6, include the UZF advanced stress package. Additional information related to custom integrated hydrological model development is available per Appendix A.1, Table A1.

Figure A2. Conceptual coupling in pyHS2MF6. Near surface and groundwater model domains and linked boundary flows in pyHS2MF6. Arrows are shown representing flow between HSPF and MODFLOW 6. The spatial mapping component translates between the HSPF lumped parameter representation of watersheds and stream segments, represented with a single arrow in the diagram, and the three-dimensional, computational grid representation of MODFLOW 6, represented with multiple arrows in the diagram. pyHS2MF6 simulates and accounts for flows in both directions: (1) from HSPF to MODFLOW 6 and (2) from MODFLOW 6 to HSPF. UZF is the MODFLOW 6 unsaturated zone flow (UZF) stress package. RCHRES are well-mixed reservoir structures that represent rivers, streams, and reservoirs in HSPF. PERLND is the pervious portion of a watershed or sub-watershed. Reproduced from Ref. [67]. (CC-by-4.0).

Appendix A.2.3. Integrated Hydrological Model Implementation

Parameters from all three coupled models (HSPF, UZF, and MODFLOW 6) are adjusted during assimilation. These parameters are members of both $P(k)$ and $P(k \mid h)$. During assimilation, 184 UZF parameter values were optimized, as shown in Table A5. During integrated hydrological model assimilation, 409 HSPF parameter values, all of which relate to empirical relationships between volume and discharge in hydrological routing, were also optimized, as listed in Table A5. These 409 parameters represent individual rows in 48 volume-to-discharge lookup tables, or FTABLES. Consequently, only 48 empirical lookup tables are involved, and each has a few rows that vary during assimilation. MODFLOW 6, in contrast, has 4746 parameters constrained as part of DA, which are listed in Table A6. Regularization techniques reduce the number of parameters from more than eleven million to fewer than five thousand.

Table A5. Summary of parameters adjusted during integrated hydrological model assimilation for HSPF and UZF.

Model	Parameter	Number Adjusted	Notes
HSPF	FTABLE ¹ discharge magnitudes for focused range of volume indices	409	There are two exits or ports from each of the 24 RCHRES ² that have a volume dependent solution and use the FTABLES. The “lower” volume table indices, generally corresponding to a stage of 1.5 to 10.0 ft, were adjusted as part of coupled model calibration.
UZF	Vertical, saturated hydraulic conductivity vks, which is similar to $K_{z z}$	92	UZF assumes vertical porous media flow using a method of characteristics solution. Parameterization is zone-based for regionalization with zones identified from HSPF RCHRES ² or PERLND ³ footprints. Gross adjustment range was 1.1 to 500.0 m/d.
UZF	Saturated water content, thts, which is similar to $S_{y}$	92	UZF assumes vertical porous media flow using a method of characteristics solution. Parameterization is zone-based for regionalization with zones identified from HSPF RCHRES ² or PERLND ³ footprints. Gross adjustment range was 0.10 to 0.35.

¹ FTABLE are lookup tables that portray the empirical relationship between depth, volume, and discharge for hydrological routing in HSPF. ² RCHRES are well-mixed water bodies in HSPF that are used to represent river reaches in the integrated hydrological model. ³ PERLND are pervious watershed land segments in HSPF.
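The FTABLE adjustment described in Table A5 can be illustrated with a minimal sketch. It assumes linear interpolation between table rows and a simple multiplier applied to the discharge of selected rows; the table values, the function names `discharge_from_volume` and `scale_rows`, and the multiplier are hypothetical and are not taken from the study or from the pyHS2MF6 code base.

```python
from bisect import bisect_right

# Hypothetical FTABLE rows: (storage volume, discharge). Values are illustrative only.
FTABLE = [(0.0, 0.0), (10.0, 5.0), (50.0, 40.0), (200.0, 300.0)]


def discharge_from_volume(volume, ftable=FTABLE):
    """Linearly interpolate discharge for a storage volume, clamping at the table ends."""
    vols = [v for v, _ in ftable]
    if volume <= vols[0]:
        return ftable[0][1]
    if volume >= vols[-1]:
        return ftable[-1][1]
    i = bisect_right(vols, volume)  # index of the first row with volume above the query
    (v0, q0), (v1, q1) = ftable[i - 1], ftable[i]
    return q0 + (q1 - q0) * (volume - v0) / (v1 - v0)


def scale_rows(ftable, rows, factor):
    """Return a copy of the table with discharge in the selected row indices scaled."""
    return [(v, q * factor if i in rows else q) for i, (v, q) in enumerate(ftable)]


# Adjust only the "lower" rows, mimicking a focused range of volume indices.
adjusted = scale_rows(FTABLE, {1, 2}, 2.0)
print(discharge_from_volume(30.0), discharge_from_volume(30.0, adjusted))
```

Under these assumptions, each adjustable parameter corresponds to one row's discharge value, so a few rows per table can change while the remaining rows stay fixed.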

Table A6. Summary of parameters adjusted during integrated hydrological data assimilation for MODFLOW 6.

Parameter	Number Adjusted	Notes
DRN ¹ Conductance	17	DRNs represent designated springs. One conductance value is applied to all DRN cells that represent a particular spring.
Zone-based hydraulic conductivity in x-direction, $K_{x x}$	121	Varies by water budget unit. Gross overall range is 0.04 to 100.0 m/d.
Zone-based hydraulic conductivity in y-direction, $K_{y y}$	121	Varies by water budget unit. Gross overall range is 0.04 to 100.0 m/d.
Zone-based hydraulic conductivity in z-direction, $K_{z z}$	99	Some important confining water budget units have z–direction pilot points and zone-based values for x- and y-direction and specific yield. Varies by water budget unit. Gross overall range is 0.01 to 42.0 m/d.
Zone-based specific yield, $S_{y}$	121	Varies by water budget unit. Gross overall range is 0.10 to 0.36. Specific storage, $S_{s}$ , calculated from $S_{y}$ using Equation (A1).
Pilot points (PP) hydraulic conductivity in x-direction, $K_{x x}$	875	Varies by water budget unit. Gross overall range is 0.8 to 141.5 m/d.
Pilot points (PP) hydraulic conductivity in y-direction, $K_{y y}$	875	Varies by water budget unit. Gross overall range is 0.8 to 141.5 m/d.
Pilot points (PP) hydraulic conductivity in z-direction, $K_{z z}$	1642	Some important confining water budget units have z–direction pilot points and zone-based values for x- and y-direction and specific yield. Varies by water budget unit. Gross overall range is 0.007 to 55.0 m/d.
Pilot points (PP) specific yield, $S_{y}$	875	Varies by water budget unit. Gross overall range is 0.10 to 0.45. Specific storage, $S_{s}$ , calculated from $S_{y}$ using Equation (A1).

¹ DRN refers to the Drain Package in MODFLOW 6.
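As a quick arithmetic check, the per-model totals quoted in the text can be recomputed from the "Number Adjusted" columns of Table A5 and Table A6. The dictionary keys below are shorthand labels chosen for this sketch; the counts are copied from the tables.

```python
# Counts copied from the "Number Adjusted" columns of Table A5 and Table A6.
hspf = {"FTABLE discharge rows": 409}
uzf = {"vks": 92, "thts": 92}
modflow6 = {
    "DRN conductance": 17,
    "zone Kxx": 121,
    "zone Kyy": 121,
    "zone Kzz": 99,
    "zone Sy": 121,
    "PP Kxx": 875,
    "PP Kyy": 875,
    "PP Kzz": 1642,
    "PP Sy": 875,
}

# Sum the adjusted parameter counts for each coupled model component.
totals = {
    "HSPF": sum(hspf.values()),
    "UZF": sum(uzf.values()),
    "MODFLOW 6": sum(modflow6.values()),
}
for model, count in totals.items():
    print(f"{model}: {count}")
```

The sums reproduce the 409 HSPF, 184 UZF, and 4746 MODFLOW 6 parameter counts stated in the text.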

In HSPF, a network of ODE continuity equations provides for hydrological routing of water across the study site. MODFLOW 6, in contrast, solves a system of PDEs for hydraulic head. ODEs have a single independent variable; in HSPF, water storage volume changes across simulation time. PDEs have multiple independent variables; in MODFLOW 6, hydraulic head changes across time and across the x, y, and z spatial dimensions of the Cartesian coordinate system. Consequently, HSPF is a reduced dimensionality representation and has reduced complexity relative to MODFLOW 6. To further reduce the degrees of freedom in the integrated hydrological model assimilation, a standalone HSPF-only assimilation was conducted prior to the integrated model training. The watershed parameters listed in Table A7 were optimized during this preliminary assimilation and then fixed for integrated hydrological model assimilation.

Table A7. Parameters adjusted for HSPF standalone assimilation.

Parameter Name	Structure Type	Definition	Range ¹, Max ²	Range ¹, Min ²
LZSN	PERLND ³	Lower zone nominal soil moisture storage in inches	9.0	3.0
INFILT	PERLND ³	Index to infiltration capacity, in/hr	0.4	0.001
AGWRC	PERLND ³	Base groundwater recession	0.999	0.85
DEEPFR	PERLND ³	Fraction of groundwater inflow to “deep recharge” ⁴	0.9	0.0
UZSN	PERLND ³	Upper zone nominal soil moisture storage in inches	2.0	0.05
INTFW	PERLND ³	Interflow inflow parameter	10.0	0.58
IRC	PERLND ³	Interflow recession parameter	0.85	0.3
RETSC	IMPLND ⁵	Retention storage capacity in inches	0.3	0.01

¹ Ranges for HSPF parameters are based on professional judgment and Ref. [32]. ² “English” or standard units are listed because mHSP2 can only use these units. ³ PERLND are pervious land segments in the watershed. ⁴ “Deep recharge” in HSPF documentation is not “recharge” according to the definitions employed for MODFLOW 6 and for this study. “DEEPFR” is the fraction of active groundwater inflow, which is subsurface storm flow, that goes to deep percolation and inactive groundwater inflow (IGWI) in HSPF. IGWI is sent to UZF as part of dynamic coupling. ⁵ IMPLND are impervious land segments in the watershed.

The goal for assimilation is to utilize regularization to set the level of model complexity to optimize the bias-variance tradeoff as shown graphically in Figure 5. When an ensemble of models is the result of assimilation, the ensemble provides extra variability to account for uncertainty as described graphically in Figure A3.

Figure A3. Conceptual schematic of observation error models in data assimilation (DA) and optimizing the bias–variance tradeoff. The prior parameter distribution, “Prior”, is estimated using professional judgment with the goal of uncovering a posterior parameter distribution, “Posterior”, that produces numerical model forecasts, which are congruent with observations and that has minimum variance and bias. Panel (a) represents the case of no observation error model in DA, where bias is a concern. Here, the range of target values for each history matching interval is narrower than the expected measurement and representation error, causing the inverse solution to overfit to a narrow range of relatively unlikely parameter values as shown on the left-side of panel (a). Variance denotes the spread of parameter values in the posterior. The goal of DA is to use observations to constrain, or narrow, the posterior relative to the prior. In panel (b), the introduction of observation uncertainty and the use of an observation error model promote a posterior with an expected value within the range of relatively likely prior values from professional judgment and with reduced variance, and thus reduced uncertainty, relative to the prior. Note that a single parameter and target are shown for illustrative purposes. Ensemble methods for DA work with hundreds to millions of parameters and hundreds to thousands of targets simultaneously. Reproduced from Ref. [19]. (CC-by-4.0).

In the MODFLOW 6 model, four regularization techniques reduce complexity by decreasing the number of assimilated parameters: (1) regionalization into the seven water budget regions shown in Figure 3, (2) regionalization into homogeneous hydrostratigraphic zones within water budget regions for relatively low transmissivity units, (3) geostatistical interpolation within hydrostratigraphic zones for the four aquifers, and (4) use of the empirical relationship in Equation (A1) so that one storage parameter, instead of two, is included in optimization of parameter values. Zone-based hydrostratigraphic regionalization involves using a single parameter value for the collection of computational cells that are identified as part of the zone. Figure A4 identifies the hydrostratigraphic units in the near surface that are represented as zones for regularization. Figure A5 shows that distinct hydrostratigraphic zones exist in each water budget region. Table A8 and Table A9 identify the nesting of hydrostratigraphic zones within water budget regions.

Kriging with pilot points (PP) is the geostatistical interpolation method employed for regionalization, and thus regularization. Parameter values are varied as part of history matching for PP locations. A geostatistical model of variance with spatial offset, or variogram, is used in the kriging approach to interpolate values for all grid cells from designated pilot point locations. The result is a continuous parameter field within the hydrostratigraphic zones that have geostatistical interpolation.
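The pilot point interpolation described above can be sketched with ordinary kriging in pure Python. This is a minimal illustration under stated assumptions, not the study's implementation: the exponential variogram with zero nugget, the correlation length, the pilot point coordinates, and the pilot point values (treated here as log10 hydraulic conductivities) are all hypothetical, as are the helper names `exp_variogram`, `solve`, and `krige`.

```python
import math


def exp_variogram(h, sill=1.0, corr_len=1000.0):
    """Exponential variogram with zero nugget (assumed model, not from the study)."""
    return sill * (1.0 - math.exp(-h / corr_len))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def krige(pilot_xy, pilot_values, cell_xy):
    """Ordinary kriging estimate at one grid cell from pilot point values."""
    n = len(pilot_xy)
    # Variogram matrix between pilot points, bordered by the unbiasedness constraint.
    a = [[exp_variogram(math.dist(p, q)) for q in pilot_xy] + [1.0] for p in pilot_xy]
    a.append([1.0] * n + [0.0])
    b = [exp_variogram(math.dist(p, cell_xy)) for p in pilot_xy] + [1.0]
    weights = solve(a, b)[:n]  # drop the Lagrange multiplier
    return sum(w * v for w, v in zip(weights, pilot_values)), weights


# Hypothetical pilot point locations (m) and values.
pilot_xy = [(0.0, 0.0), (800.0, 0.0), (0.0, 600.0), (900.0, 700.0)]
pilot_log10_k = [0.5, 1.2, 0.9, 1.6]
cell_value, cell_weights = krige(pilot_xy, pilot_log10_k, (400.0, 300.0))
print(cell_value, sum(cell_weights))
```

Repeating the `krige` call for every grid cell in a zone yields the continuous parameter field, while only the pilot point values are adjusted during history matching.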

Figure A4. Surficial hydrostratigraphy showing major fault zones. Surficial stratigraphy comes from the Geologic Atlas of Texas [68] with localized refinements based on the Hydrogeologic Atlas of the Hill Country Trinity Aquifer [56]. The surficial stratigraphy in the legend, with the exception of the fault zones, is ordered with youngest at top and increases in age moving downwards. Table A8 provides description of hydrostratigraphic units. Each unit in the legend represents regionalization into hydrostratigraphic zones.

Figure A5. Hydrostratigraphy across the Toms Creek Fault Zone from MODFLOW 6 mesh column 450 from station 32,000 to 35,500 m. The left side of the section is in Water Budget Region #6 as shown in Figure 3, and the right side is in Water Budget Region #4. This example section shows regionalization by water budget region and by hydrostratigraphic unit. Note that the “Major Fault Zones” from Figure A1 are given their own hydrostratigraphic zone to facilitate communication between water budget regions in the integrated hydrological model. Short labels for Water Budget Regions are provided on the location map inset at the top, and these short labels are defined in Table A4.

To reduce the number of parameters that must be optimized during assimilation, specific storage, $S_{s}$, is estimated from specific yield, $S_{y}$. Equation (A1) provides a linearly proportional estimate of $S_{s}$, based on the feasible range of $S_{s}$ values from literature, to the specified value for $S_{y}$.

$$S_{s} = \left( \frac{S_{y} - S_{y,\min}}{S_{y,\max} - S_{y,\min}} \times \left( S_{s,\max} - S_{s,\min} \right) \right) + S_{s,\min}$$

(A1)

In Equation (A1), $S_{s}$ is specific storage with units of $\mathrm{m}^{-1}$, $S_{y}$ is specific yield, which is dimensionless, $S_{s,\max} = 1.67 \times 10^{-4}\ \mathrm{m}^{-1}$, $S_{s,\min} = 3.33 \times 10^{-7}\ \mathrm{m}^{-1}$, $S_{y,\max} = 0.45$, and $S_{y,\min} = 0.12$.
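Equation (A1) can be written as a short function. The sketch below uses the four constants stated for Equation (A1); the function name `specific_storage` and the constant names are chosen here for illustration only.

```python
import math

# Constants as stated for Equation (A1).
SS_MAX = 1.67e-4  # maximum specific storage, 1/m
SS_MIN = 3.33e-7  # minimum specific storage, 1/m
SY_MAX = 0.45     # maximum specific yield, dimensionless
SY_MIN = 0.12     # minimum specific yield, dimensionless


def specific_storage(sy):
    """Linearly map specific yield to specific storage (1/m) per Equation (A1)."""
    fraction = (sy - SY_MIN) / (SY_MAX - SY_MIN)
    return fraction * (SS_MAX - SS_MIN) + SS_MIN


for sy in (SY_MIN, 0.285, SY_MAX):
    print(f"Sy = {sy:.3f} -> Ss = {specific_storage(sy):.3e} 1/m")
```

The sketch does not clamp its input, so a specific yield outside the interval from $S_{y,\min}$ to $S_{y,\max}$ is extrapolated along the same line.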

Table A8. Extended hydrostratigraphy IDs and description for groundwater flow model implementation. Hydrostratigraphy units are listed in increasing age.

Color Key ¹	ID	Name	Type	Purpose
	1	Alluvium	transition	At and near surface transition to groundwater
	2	HSG AB ²	transition	At and near surface transition to groundwater
	5	Karst	transition	At and near surface transition to groundwater
	6	Karst Transition	transition	At and near surface transition to groundwater
	11	Late K Confining	confining	Surficial deposits east and south of BFZ
	21	Plateau Edwards	NA	Minimal footprint and only in far west
	31	BFZ Edwards	primary aquifer	BFZ Edwards Aquifer
	32	Regional Dense (BFZ Edwards)	confining	Internal separation of BFZ Edwards
	33	Basal Nodular (BFZ Edwards)	confining	Separation from Upper Trinity Aquifer
	41	Upper Glen Rose (UGR)	aquifer	Upper Trinity Aquifer
	42	Upper Glen Rose, part of BFZ Edwards Aquifer	aquifer	BFZ Edwards Aquifer
	43	Lower Aquitard, Unit 3 (UGR)	confining	Separation from Lower Glen Rose
	46	Lower Glen Rose (LGR)	aquifer	part of Middle Trinity Aquifer
	47	Upper Aquitard (LGR)	confining	Internal separation from Cow Creek
	48	Reef Facies (LGR)	aquifer	part of Middle Trinity Aquifer [56]
	51	Hensel siliciclastic facies	aquifer	Part of Middle Trinity Aquifer in the western part of the domain [69]
	52	Hensel, thin silty facies	confining	Separation of LGR and Reef Facies from Cow Creek [69]
	56	Cow Creek	aquifer	Most productive part of Middle Trinity Aquifer
	61	Hammett	confining	Separation between Middle and Lower Trinity Aquifers
	71	Sycamore or Hosston	aquifer	Lower Trinity Aquifer
	81	Undifferentiated Paleozoic	NA	Minor instances in the north near Pedernales River

¹ Color key corresponds to colors used on Figure A4 and Figure A5. ² “HSG AB” refers to hydrologic soil group (HSG) A and HSG B [70,71].

Table A9. Water budget unit and hydrostratigraphy ID mapping and regionalization.

Color Key ¹	Water Budget Unit ID ²	Hydrostratigraphy IDs ³	Description	Regionalization Method ⁴
	X0001	1	Alluvium	zone for all
	X0002	2	HSG AB	zone for all
	X0005	5, 6	Karst	zone for all
	X0011	11	Late K Confining	zone for all
	X0021	21	Plateau Edwards	zone for all
	X0031	31, 42	BFZ Edwards Aquifer	PP for all
	X0032	32, 33	Confining within BFZ Edwards	PP for $K_{z z}$ ; zone for $K_{x x}$ , $K_{y y}$ , and $S_{y}$
	X0041	41	Upper Trinity Aquifer	PP for $K_{z z}$ ; zone for $K_{x x}$ , $K_{y y}$ , and $S_{y}$
	X0042	43	Upper Glen Rose (UGR), confining	PP for $K_{z z}$ ; zone for $K_{x x}$ , $K_{y y}$ , and $S_{y}$
	X0051	46, 48	Lower Glen Rose (LGR), Middle Trinity Aquifer	PP for all
	X0052	47	Lower Glen Rose (LGR), confining	PP for $K_{z z}$ ; zone for $K_{x x}$ , $K_{y y}$ , and $S_{y}$
	X0053	56, 51	Cow Creek, Middle Trinity Aquifer	PP for all
	X0054	52	Hensel, confining	zone for all
	X0061	61	Hammett, confining	PP for $K_{z z}$ ; zone for $K_{x x}$ , $K_{y y}$ , and $S_{y}$
	X0071	71	Lower Trinity Aquifer	PP for all
	X0081	81	Undifferentiated Paleozoic	zone for all

¹ Color key is for the “Water Budget Unit ID” column and corresponds to the color key for the primary hydrostratigraphic unit by “Hydrostratigraphy ID” in Table A8. ² “Water Budget Unit ID” is a five digit integer label with the left-most digit, shown with “X” in this table, being the Water Budget Region ID #. ³ Hydrostratigraphic IDs are listed and defined in Table A8. ⁴ Regionalization method can be pilot points (PP) or zone and different methods can be used for different parameters within the same “Water Budget Unit ID”.

Appendix A.2.4. Integrated Hydrological Model Validation

History matching and validation are implemented jointly for the integrated hydrological model because of data scarcity. The primary data limitations are as follows: (1) six of the 16 stream gauging stations (38% of the locations) started operation after 1 January 2015, and (2) the best spatial resolution of water level elevation observations is one observation location per 40.5 km² on average and exists for the most recent historical times; no dedicated monitoring wells are available; and the temporal relationship between pumping and collection of water level observations is unknown.

The combined, or joint, history matching and validation simulation timeline is enumerated below. History matching is the sub-interval across which historic state observations are used to adjust parameter values to find the constrained posterior parameter distribution. The validation sub-interval is independent from the history matching sub-interval because parameters are not adjusted during the validation sub-interval. Rather, goodness-of-fit between simulated and observed values during the validation interval provides description of generalization skill.

Simulation and Burn-in Start: 2015-01-01
Burn-in End: 2016-06-30 (1.5 years of burn-in)
History Matching Start: 2016-07-01
History Matching End: 2019-12-31 (3.5 years of calibration)
Validation Start: 2020-01-01
Validation and Simulation End: 2021-12-31 (2.0 years of validation; 7.0 years of simulation)
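The timeline above can be encoded directly, which makes it straightforward to confirm the stated durations and to label any simulation day by sub-interval. The names `intervals`, `length_years`, and `phase` are illustrative, and the 365.25 days per year convention is an assumption of this sketch.

```python
from datetime import date, timedelta

# Sub-interval boundaries copied from the simulation timeline.
intervals = {
    "burn-in": (date(2015, 1, 1), date(2016, 6, 30)),
    "history matching": (date(2016, 7, 1), date(2019, 12, 31)),
    "validation": (date(2020, 1, 1), date(2021, 12, 31)),
}


def length_years(start, end):
    """Inclusive interval length in years, assuming 365.25 days per year."""
    return ((end - start).days + 1) / 365.25


def phase(day):
    """Return the sub-interval name that contains a simulation day, or None."""
    for name, (start, end) in intervals.items():
        if start <= day <= end:
            return name
    return None


for name, (start, end) in intervals.items():
    print(f"{name}: {length_years(start, end):.1f} years")
print(phase(date(2020, 3, 15)))
```

Parameters are adjusted only against observations whose dates fall in the history matching sub-interval; observations in the validation sub-interval are used solely for goodness-of-fit evaluation.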

Validation goodness-of-fit metrics are Nash–Sutcliffe Efficiency (NSE) [48,49] for discharge and Normalized Root Mean Square Error (NRMSE) for water levels in wells. The NRMSE is the root mean square error (RMSE) normalized by the observed range and converted to a percentage. The RMSE is defined in Equation (A2), where $s_{i}$ are simulated values, $o_{i}$ are observed values, and $N$ is the count of compared values. NRMSE can range from zero to $\infty$, with zero representing a perfect match.

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} \left( o_{i} - s_{i} \right)^{2}}{N}}$$

(A2)
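A minimal sketch of the RMSE in Equation (A2) and the NRMSE described in the text follows. The function names and the sample water levels are hypothetical; the normalization by the observed range and the conversion to a percentage follow the definition given above.

```python
import math


def rmse(obs, sim):
    """Root mean square error between observed and simulated values, Equation (A2)."""
    n = len(obs)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / n)


def nrmse_percent(obs, sim):
    """RMSE normalized by the observed range and expressed as a percentage."""
    return 100.0 * rmse(obs, sim) / (max(obs) - min(obs))


# Hypothetical water level elevations, in meters.
observed = [10.0, 12.0, 14.0]
simulated = [11.0, 12.0, 13.0]
print(f"RMSE = {rmse(observed, simulated):.3f} m")
print(f"NRMSE = {nrmse_percent(observed, simulated):.1f}%")
```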

Equation (A3) defines the NSE. It can range from $-\infty$ to 1.0, with 1.0 being a perfect match.

$$\mathrm{NSE} = 1.0 - \frac{\sum_{i=1}^{N} \left( s_{i} - o_{i} \right)^{2}}{\sum_{i=1}^{N} \left( o_{i} - \bar{o} \right)^{2}}$$

(A3)
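Equation (A3) can be sketched in the same way. The function name `nse` and the sample discharge values are hypothetical. The example also shows the familiar reference point that a simulation equal to the observed mean at every time gives an NSE of zero.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency per Equation (A3)."""
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((s - o) ** 2 for o, s in zip(obs, sim))
    observed_variation = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / observed_variation


# Hypothetical discharge values.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.0, 2.0, 3.0, 5.0]
mean_only = [sum(observed) / len(observed)] * len(observed)

print(nse(observed, simulated))
print(nse(observed, observed))
print(nse(observed, mean_only))
```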

References

Evensen, G.; Vossepoel, F.; Jan van Leeuwen, P. Data Assimilation Fundamentals: A Unified Formulation of the State and Parameter Estimation Problem; Springer: Cham, Switzerland, 2022. [Google Scholar]
White, J.T. A model-independent iterative ensemble smoother for efficient history-matching and uncertainty quantification in very high dimensions. Environ. Model. Softw. 2018, 109, 191–201. [Google Scholar] [CrossRef]
Tonkin, M.; Doherty, J. Calibration-constrained Monte Carlo analysis of highly parameterized models using subspace techniques. Water Resour. Res. 2009, 45, W00B10. [Google Scholar] [CrossRef]
Doherty, J. Calibration and Uncertainty Analysis for Complex Environmental Models. PEST: Complete Theory and What It Means for Modelling the Real World; Watermark Numerical Computing: Brisbane, Australia, 2015. [Google Scholar]
Alstad, V.; Skogestad, S. Null Space Method for Selecting Optimal Measurement Combinations as Controlled Variables. Ind. Eng. Chem. Res. 2007, 46, 846–853. [Google Scholar] [CrossRef]
Zeng, G.L.; Gullberg, G.T. Null-Space Function Estimation for the Interior Problem. Phys. Med. Biol. 2012, 57, 1873–1887. [Google Scholar] [CrossRef]
Herckenrath, D.; Langevin, C.D.; Doherty, J. Predictive uncertainty analysis of a saltwater intrusion model using null-space Monte Carlo. Water Resour. Res. 2011, 47, 47. [Google Scholar] [CrossRef]
Yoon, H.; Hart, D.B.; McKenna, S.A. Parameter estimation and predictive uncertainty in stochastic inverse modeling of groundwater flow: Comparing null-space Monte Carlo and multiple starting point methods. Water Resour. Res. 2013, 49, 536–553. [Google Scholar] [CrossRef]
Tavakoli, R.; Yoon, H.; Delshad, M.; ElSheikh, A.H.; Wheeler, M.F.; Arnold, B.W. Comparison of ensemble filtering algorithms and null-space Monte Carlo for parameter estimation and uncertainty quantification using CO2 sequestration data. Water Resour. Res. 2013, 49, 8108–8127. [Google Scholar] [CrossRef]
Colombo, L.; Alberti, L.; Mazzon, P.; Antelmi, M. Null-Space Monte Carlo Particle Backtracking to Identify Groundwater Tetrachloroethylene Sources. Front. Environ. Sci. 2020, 8, 142. [Google Scholar] [CrossRef]
Moeck, C.; Molson, J.; Schirmer, M. Pathline Density Distributions in a Null-Space Monte Carlo Approach to Assess Groundwater Pathways. Groundwater 2020, 58, 189–207. [Google Scholar] [CrossRef]
Saad, S.; Javadi, A.A.; Farmani, R.; Sherif, M. Efficient uncertainty quantification for seawater intrusion prediction using Optimized sampling and Null Space Monte Carlo method. J. Hydrol. 2023, 620, 129496. [Google Scholar] [CrossRef]
Baalousha, H.M. Predictive uncertainty analysis for a highly parameterized karst aquifer using null-space Monte Carlo. Front. Water 2024, 6, 1384983. [Google Scholar] [CrossRef]
Moges, E.; Demissie, Y.; Li, H. Uncertainty propagation in coupled hydrological models using winding stairs and null-space Monte Carlo methods. J. Hydrol. 2020, 589, 125341. [Google Scholar] [CrossRef]
Hewett, R.J.; Heath, M.T.; Butala, M.D.; Kamalabadi, F. A Robust Null Space Method for Linear Equality Constrained State Estimation. IEEE Trans. Signal Process. 2010, 58, 3961–3971. [Google Scholar] [CrossRef]
Yang, Y.; Maley, J.; Huang, G. Null-space-based marginalization: Analysis and algorithm. In Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 24–28 September 2017; pp. 6749–6755, ISSN 2153-0866. [Google Scholar] [CrossRef]
Martin, N.; Green, R.T.; Nicholaides, K.; Fratesi, S.B.; Nunu, R.R.; Flores, M.E. Blanco River Aquifer Assessment Tool—A Tool to Assess How the Blanco River Interacts with Its Aquifers: Creating the Conceptual Model; Technical Report; Meadows Center for Water and the Environment, Texas State University: San Marcos, TX, USA, 2019. [Google Scholar]
TWDB. Texas Instream Flow Program, 2024. Available online: https://www.twdb.texas.gov/surfacewater/flows/instream/index.asp (accessed on 11 June 2024).
Martin, N.; White, J. Flow Regime-Dependent, Discharge Uncertainty Envelope for Uncertainty Analysis with Ensemble Methods. Water 2023, 15, 1133. [Google Scholar] [CrossRef]
McMillan, H.; Krueger, T.; Freer, J. Benchmarking observational uncertainties for hydrology: Rainfall, river discharge and water quality. Hydrol. Process. 2012, 26, 4078–4111. [Google Scholar] [CrossRef]
Slade, R.M. A recharge-discharge water budget and evaluation of water budgets for the Edwards Aquifer associated with Barton Springs. Tex. Water J. 2017, 8, 42–56. [Google Scholar] [CrossRef]
Slade, R.M. Documentation of a recharge-discharge water budget and main-streambed recharge volumes, and fundamental evaluation of groundwater tracer studies for the Barton Springs segment of the Edwards Aquifer. Tex. Water J. 2014, 5, 12–23. [Google Scholar] [CrossRef]
HDR Engineering, Inc.; Paul Price Associates, Inc.; LBG-Guyton Associates; Fugro-McClelland (SW), Inc. Trans-Texas Water Program, West Central Study Area Phase II: Edwards Aquifer Recharge Analyses; Technical Report; San Antonio River Authority and Others: San Antonio, TX, USA, 1998. [Google Scholar]
Puente, C. Method of Estimating Natural Recharge to the Edwards Aquifer in the San Antonio Area, Texas; Water-Resources Investigation Report 78-0010; U.S Geological Survey: Austin, TX, USA, 1978.
Orszag, S.A. Analytical theories of turbulence. J. Fluid Mech. 1970, 41, 363–386. [Google Scholar] [CrossRef]
Ciofalo, M. Direct Numerical Simulation (DNS). In Thermofluid Dynamics of Turbulent Flows: Fundamentals and Modelling; Ciofalo, M., Ed.; UNIPA Springer Series; Springer International Publishing: Cham, Switzerland, 2022; pp. 37–46. [Google Scholar] [CrossRef]
Jungbecker, P.; Veit, D. Computational fluid dynamics (CFD) and its application to textile technology. In Simulation in Textile Technology; Veit, D., Ed.; Woodhead Publishing Series in Textiles; Woodhead Publishing: Sawston, UK, 2012; pp. 142–171, 172e–178e. [Google Scholar] [CrossRef]
Roelofs, F.; Shams, A. CFD—Introduction. In Thermal Hydraulics Aspects of Liquid Metal Cooled Nuclear Reactors; Roelofs, F., Ed.; Woodhead Publishing: Sawston, UK, 2019; pp. 213–218. [Google Scholar] [CrossRef]
Casulli, V.; Cattani, E. Stability, accuracy and efficiency of a semi-implicit method for three-dimensional shallow water flow. Comput. Math. Appl. 1994, 27, 99–112. [Google Scholar] [CrossRef]
Casulli, V.; Cheng, R.T. Semi-implicit finite difference methods for three-dimensional shallow water flow. Int. J. Numer. Methods Fluids 1992, 15, 629–648. [Google Scholar] [CrossRef]
Chow, V.T.; Maidment, D.R.; Mays, L.W. Applied Hydrology; McGraw-Hill Education: New York, NY, USA, 1988. [Google Scholar]
US EPA. BASINS Technical Note 6: Estimating Hydrology and Hydraulic Parameters for HSPF; Technical Note EPA-823-R00-012; US EPA: Washington, DC, USA, 2000.
Langevin, C.D.; Hughes, J.D.; Banta, E.R.; Provost, A.M.; Niswonger, R.G.; Panday, S. MODFLOW 6 Modular Hydrologic Model Version 6.1.1; U.S. Geological Survey: Reston, VA, USA, 2019. [CrossRef]
U.S. Geological Survey. MODFLOW 6: USGS Modular Hydrologic Model; U.S. Geological Survey: Reston, VA, USA, 2020.
Freeze, A.R.; Cherry, J.M. Groundwater; Pearson: London, UK, 1979. [Google Scholar]
Bear, J. Dynamics of Fluids in Porous Media, Unabridged Reprint ed.; Dover Science Books, Dover Publications, Inc.: New York, NY, USA, 1988; Original Publication: 1972 American Elsevier Publishing Company, Inc. [Google Scholar]
Pest++ Development Team. PEST++: Software Suite for Parameter Estimation, Uncertainty Quantification, Management Optimization, and Sensitivity Analysis. Version 5.1.18. User Manual. Pest++ Development Team. [Online]. 2022. Available online: https://github.com/usgs/pestpp (accessed on 26 June 2024).
White, J.T.; Hunt, R.J.; Fienen, M.N.; Doherty, J. Approaches to Highly Parameterized Inversion: PEST++ Version 5, a Software Suite for Parameter Estimation, Uncertainty Analysis, Management Optimization and Sensitivity Analysis; Techniques and Methods 7C26; U.S. Geological Survey: Reston, VA, USA, 2020.
Chen, Y.; Oliver, D.S. Levenberg–Marquardt forms of the iterative ensemble smoother for efficient history matching and uncertainty quantification. Comput. Geosci. 2013, 17, 689–703. [Google Scholar] [CrossRef]
Doherty, J. PEST Model-Independent Parameter Estimation, User Manual Part I: PEST, SENSAN and Global Optimisers, User Manual 7th ed.; Watermark Numerical Computing: Brisbane, Australia, 2020. [Google Scholar]
Evans, M.; Moshonov, H. Checking for prior-data conflict. Bayesian Anal. 2006, 1, 893–914. [Google Scholar] [CrossRef]
Alfonzo, M.; Oliver, D.S. Evaluating prior predictions of production and seismic data. Comput. Geosci. 2019, 23, 1331–1347. [Google Scholar] [CrossRef]
Oliver, D.S. Diagnosing reservoir model deficiency for model improvement. J. Pet. Sci. Eng. 2020, 193, 107367. [Google Scholar] [CrossRef]
Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed.; Springer Science+Business Media: New York, NY, USA, 2016. [Google Scholar] [CrossRef]
Chollet, F. Deep Learning with Python, 2nd ed.; Manning Publications Company: Shelter Island, NY, USA, 2021. [Google Scholar]
Martin, N.; White, J. Water Resources’ AI–ML Data Uncertainty Risk and Mitigation Using Data Assimilation. Water 2024, 16, 2758. [Google Scholar] [CrossRef]
ASTM International. Standard Guide for Conducting a Sensitivity Analysis for a Groundwater Flow Model Application; ASTM International: West Conshohocken, PA, USA, 2016. [Google Scholar]
Nash, J.E.; Sutcliffe, J.V. River flow forecasting through conceptual models part I—A discussion of principles. J. Hydrol. 1970, 10, 282–290. [Google Scholar] [CrossRef]
Legates, D.R.; McCabe, G.J. Evaluating the use of “goodness-of-fit” measures in hydrologic and hydroclimatic model validation. Water Resour. Res. 1999, 35, 233–241. [Google Scholar] [CrossRef]
Anderson, M.P.; Woessner, W.W. Applied Groundwater Modeling: Simulation of Flow and Advective Transport, 1st ed.; Academic Press: San Diego, CA, USA, 1991. [Google Scholar]
Martin, N. Watershed-Scale, Probabilistic Risk Assessment of Water Resources Impacts from Climate Change. Water 2021, 13, 40. [Google Scholar] [CrossRef]
Martin, N. Risk Assessment of Future Climate and Land Use/Land Cover Change Impacts on Water Resources. Hydrology 2021, 8, 38. [Google Scholar] [CrossRef]
Karniadakis, G.E.; Kevrekidis, I.G.; Lu, L.; Perdikaris, P.; Wang, S.; Yang, L. Physics-informed machine learning. Nat. Rev. Phys. 2021, 3, 422–440. [Google Scholar] [CrossRef]
Delottier, H.; Doherty, J.; Brunner, P. Data space inversion for efficient uncertainty quantification using an integrated surface and sub-surface hydrologic model. Geosci. Model Dev. 2023, 16, 4213–4231. [Google Scholar] [CrossRef]
Bala, B.K.; Arshad, F.M.; Noh, K.M. System Dynamics: Modelling and Simulation; Springer Texts in Business and Economics; Springer: Singapore, 2017. [Google Scholar]
Wierman, D.; Broun, A.; Hunt, B. Hydrogeologic Atlas of the Hill Country Trinity Aquifer: Blanco, Hays, and Travis Counties, Central Texas; Hydrogeologic Atlas; The Meadows Center for Water and the Environment—Texas State University: San Marcos, TX, USA, 2010. [Google Scholar]
Hunt, B.; Smith, B.; Andrews, A.; Wierman, D.; Broun, A.; Gary, M. Relay Ramp Structures and their Influence on Groundwater Flow in the Edwards and Trinity Aquifers, Hays and Travis Counties, Central Texas. In Proceedings of the 14th Multidisciplinary Conference on Sinkholes and the Engineering and Environmental Impacts of Karst, Carlsbad, NM, USA, 5–9 October 2015; pp. 189–200. [Google Scholar]
Hunt, B.B.; Smith, B.A.; Gary, M.O.; Broun, A.S.; Wierman, D.A.; Watson, J.; Johns, D. Surface-water and Groundwater Interactions in the Blanco River and Onion Creek Watersheds: Implications for the Trinity and Edwards Aquifers of Central Texas. Bull. South Tex. Geol. Soc. 2017, 47, 33–53. [Google Scholar]
Smith, B.A.; Hunt, B.B.; Wierman, D.A.; Gary, M.O. Groundwater Flow Systems in Multiple Karst Aquifers of Central Texas. In Proceedings of the 15th Multidisciplinary Conference on Sinkholes and the Engineering and Environmental Impacts of Karst: National Cave and Karst Research Institute (NCKRI), Carlsbad, NM, USA, 2–6 April 2018; pp. 17–29. [Google Scholar]
Clark, A.K.; Golab, J.A.; Morris, R.E. Geologic Framework, Hydrostratigraphy, and Ichnology of the Blanco, Payton, and Rough Hollow 7.5-Minute Quadrangles, Blanco, Comal, Hays, and Kendall Counties, Texas; USGS Numbered Series 3363; U.S. Geological Survey: Reston, VA, USA, 2016.
Clark, A.K.; Golab, J.A.; Morris, R.R. Geologic Framework and Hydrostratigraphy of the Edwards and Trinity Aquifers Within Northern Bexar and Comal Counties, Texas; USGS Numbered Series 3366; U.S. Geological Survey: Reston, VA, USA, 2016.
Clark, A.K.; Morris, R.R. Bedrock Geology and Hydrostratigraphy of the Edwards and Trinity Aquifers Within the Driftwood and Wimberley 7.5-Minute Quadrangles, Hays and Comal Counties, Texas; USGS Numbered Series 3386; U.S. Geological Survey: Reston, VA, USA, 2017.
Clark, A.K.; Pedraza, D.E.; Morris, R.R. Geologic Framework and Hydrostratigraphy of the Edwards and Trinity Aquifers Within Hays County, Texas; USGS Numbered Series 3418; U.S. Geological Survey: Reston, VA, USA, 2018.
Collins, E.; Hvorka, S. Structure Map of the San Antonio Segment of the Edwards Aquifer; Miscellaneous Map No. 38; University of Texas Bureau of Economic Geology (BEG): Austin, TX, USA, 1997.
Ferrill, D.A.; Morris, A.P.; McGinnis, R.N. Geologic structure of the Edwards (Balcones Fault Zone) Aquifer. In The Edwards Aquifer: The Past, Present, and Future of a Vital Water Resource; GSA Memoirs; Sharp, J.M., Jr., Green, R.T., Schindel, G.M., Eds.; Geological Society of America: Boulder, CO, USA, 2019; Volume 215, p. 18.
Ferguson, W. The Blanco River; Texas A&M University Press: College Station, TX, USA, 2017.
Martin, N. pyHS2MF6 Docs. 2021. Available online: https://nmartin198.github.io/pyHS2MF6/ (accessed on 2 March 2025).
Texas Water Development Board; U.S. Geological Survey. Geologic Atlas of Texas; Bureau of Economic Geology (BEG): Austin, TX, USA, 2007. Available online: https://www.twdb.texas.gov/groundwater/aquifer/GAT/ (accessed on 27 April 2025).
Broun, A.S.; Hunt, B.B.; Watson, J.A.; Wierman, D.A. A Reevaluation of the Lithostratigraphy and Facies of the Lower Cretaceous Hensel Sand–Lower Glen Rose Transitions, Comanchean Shelf, Blanco and Hays Counties, Central Texas. GCAGS J. 2020, 9, 76–92.
Soil Survey Staff, Natural Resources Conservation Service, United States Department of Agriculture. Web Soil Survey, 2019. Available online: https://websoilsurvey.nrcs.usda.gov/app/ (accessed on 14 April 2021).
Natural Resources Conservation Service (NRCS). Hydrologic Soil Groups. In National Engineering Handbook (NEH) Part 630, Hydrology; U.S. Department of Agriculture: Washington, DC, USA, 2007; p. 14.

Figure 1. Study area for the BRAAT project. “Important Springs” are the major spring discharge locations within the study area, each of which approximately corresponds to a stream gauging observation location. “Instream Flow Locations” are stream gauging stations where the Texas Instream Flow Program (TIFP) [18] conducts formal studies to determine the amount of water required to maintain a sound ecological environment. “Future Pumping Centers” are proposed regions of concern for future groundwater withdrawals [17]. The model domains for HSPF and MODFLOW 6 were identified as part of conceptualization [17].

Figure 2. Discharge and water level observation target locations. These are the locations of the water level observations in wells and of the stream gauging stations used for history matching. Hays Trinity Groundwater Conservation District (HTGCD) well ID 729 is one of the water level observation locations.

Figure 3. Integrated hydrological water budget regions. “Water Budget Gauges” are stream gauging stations located downstream of the Balcones Fault Zone (BFZ) Edwards Aquifer recharge zone. Each gauging station shown provides the outlet to one of the surface and near-surface water budgeting basins “Blanco River Water Budget Basin” and “Onion Creek Water Budget Basin”. “Water Budget Regions”, labeled #1–7, are subsurface volumes that have a groundwater budget calculation as part of the analysis. These regions were identified based on fault block conceptualization for the Balcones Fault Zone (BFZ), and “Bounding Faults” are the major fault zones shown in Figure A1 that were employed to segregate the water budget regions. Descriptive labels for “Water Budget Regions” are provided in Table A4.

Figure 4. Delineation of hydrological and hydrodynamical models based on the degree of physics-based representation in the algorithms. Direct numerical simulation (DNS) represents the full physics of water movement without simplification or submodels. Reynolds-averaged Navier–Stokes (RANS) algorithms employ time- and space-based averaging to provide a somewhat simplified physics-based description, relative to DNS, with a turbulence submodel. DNS and RANS are characterized as hydrodynamical models and conserve momentum. Hydrological models, in contrast, employ empirical relationships and make simplifying assumptions to represent only a subset of the forces driving water movement and so can only provide approximations and not “accuracy”. The models used in this study are hydrological models and include the Hydrological Simulation Program–FORTRAN (HSPF), the MODFLOW 6 groundwater flow model, and empirical rating curves for river discharge estimation. HSPF is composed of empirical relationships, and MODFLOW 6 uses the empirical Darcy’s equation for fluid flux. Neither HSPF nor MODFLOW 6 can represent momentum or the fluid forces required to change momentum. Complexity of representation increases from the completely empirical rating curve on the left to the complete theoretical physics-based representation of DNS on the right. Because RANS models contain sufficient physics-based representation for derivation of algorithm accuracy and stability relative to the governing physics, the term “calibration” is preferred for selecting best-fit parameters using history matching for these models. For hydrological models, data assimilation (DA) should be used for the selection of optimal parameters because hydrological models provide only approximations and will generate significant model representation error.

Figure 5. Bias–variance tradeoff and model complexity in AI–ML and data assimilation (DA). In (A), “Model Space” is all model predictions, and the “best-fit” model predictions are labeled and identified. “Truth” represents the population of observations. “Model Bias” is the statistical distance between best-fit values and expected “Truth” and represents generalization error. Note that the bias shown with the dashed line between the “Best fit in training” and “Realization or Sample Truth” is the prediction error. “Estimation Variance” for the best-fit model prediction is shown with a red circle, and the radius of the circle denotes variance magnitude. “Model Space with Regularization” is also shown, and the “Estimation Variance” for the best-fit regularized model prediction is relatively smaller, as shown by the radius of the green circle. An additional component of “Estimation Bias” is identified for the best-fit regularized model prediction relative to the best-fit model prediction, denoting that a larger “Estimation Bias” is the cost for a smaller “Estimation Variance” with regularization. (B) provides the analogous schematic for DA with data and model uncertainty. Here, regularization techniques are typically used to limit complexity, which will result in a larger relative “Estimation Bias”. Data and model prediction uncertainty are explicitly represented using an observation error model to increase the “Estimation Variance”, accounting for uncertainty. In other words, DA using a forward model with regularization and an observation error model functions to increase bias and variance to account for uncertainty, which also limits the possibility of overfitting. When the data set comprises calculated or estimated values, rather than observed values, the data set may be a biased sample of the underlying population. 
In this case, “Biased Truth Estimate” represents the data set used for training, testing, and validation, and the magnitude of “Measurement Error (Bias)” is uncertain because the sample does not accurately represent the underlying population, or “Truth”. The measurement error and representation error components of an observation error model are identified conceptually with the “halos” around “Biased Truth Estimate” and “Model Space with Regularization”, respectively. Note that it is an “expected” measurement error in the observation error model because the “true” measurement error is unknown, as the biased sample does not accurately portray the underlying population. Reproduced from Ref. [46] (CC BY 4.0).

Figure 6. Pilot point (PP) locations used for geostatistical interpolation to implement regularization. PP location IDs 16 and 38 are identified in Water Budget Region #4 for reference.

Figure 7. Sampled adjustments to observations at Hays Trinity Groundwater Conservation District (HTGCD) well ID 729, the location of which is shown in Figure 2. Thirty monthly-averaged observations are used in history matching for assimilation at this location. The observation error model for all water level observations is a spatially and temporally constant Gaussian variate defined with a mean ($\mu$) of zero and a standard deviation ($\sigma$) of 9.14 m. Sample values are shown for 30 monthly-averaged time points and 300 realizations. Observed water levels during the assimilation interval range from 222.4 to 235.5 m, and the sampled adjustments range from −36.0 to 37.2 m, showing that the adjustment value may exceed 15% of the measured elevation.
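The observation error model described in the Figure 7 caption can be illustrated with a short sampling sketch. This is a minimal example, not the authors' implementation: it assumes the adjustments are independent draws from a zero-mean Gaussian with σ = 9.14 m that are added to each observation, and the `observed` series, the function name, and the seed are hypothetical placeholders (only the 222.4 to 235.5 m range comes from the caption).

```python
import numpy as np

SIGMA_M = 9.14   # standard deviation of the observation error model (m), from the caption
N_TIMES = 30     # monthly-averaged observations at well ID 729
N_REAL = 300     # ensemble realizations


def sample_adjustments(n_real=N_REAL, n_times=N_TIMES, sigma=SIGMA_M, seed=0):
    """Draw zero-mean Gaussian adjustments, one row per realization."""
    rng = np.random.default_rng(seed)
    return rng.normal(loc=0.0, scale=sigma, size=(n_real, n_times))


adjustments = sample_adjustments()

# Hypothetical observed series spanning the reported range (illustration only).
observed = np.linspace(222.4, 235.5, N_TIMES)

# Assumption: each realization sees the observations plus its own sampled noise.
noisy_observations = observed[np.newaxis, :] + adjustments

print(adjustments.shape)                      # one row per realization
print(adjustments.min(), adjustments.max())   # extremes of the sampled adjustments
```

With 9000 draws, extremes near ±4σ (about ±36.6 m) are plausible, which is the same order as the −36.0 to 37.2 m range reported in the caption.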

Figure 8. Specific yield, $S_{y}$, samples during standalone and coupled assimilations for a sensitive parameter. The parameter shown is $S_{y}$ for pilot point (PP) index 38, as shown in Figure 6 and located in Water Budget Region #4. For standalone assimilation, the lower boundary was 0.12, and the upper boundary was 0.45. During coupled assimilation, the $Bnd_{L}$ was 0.149, and $Bnd_{U}$ was 0.351. $IQR_{101}$ was 0.033 with a 25th percentile value of 0.184 and a 75th percentile value of 0.217. The resulting $S_{e}$ was 6.1.

Figure 9. Specific yield, $S_{y}$, samples during standalone and coupled assimilations for an insensitive parameter. The parameter shown is $S_{y}$ for pilot point (PP) index 16, shown in Figure 6, located in Water Budget Region #4. For standalone assimilation, the lower boundary was 0.12, and the upper boundary was 0.45. During coupled assimilation, the $Bnd_{L}$ was 0.256, and $Bnd_{U}$ was 0.428. $IQR_{101}$ was 0.047 with a 25th percentile value of 0.362 and a 75th percentile value of 0.409. The resulting $S_{e}$ was 3.7.
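The empirical sensitivity values quoted in the Figure 8 and Figure 9 captions can be checked with a few lines of arithmetic. The formal definition of $S_{e}$ is given in the main text rather than in these captions, so the sketch below rests on an assumption: $S_{e}$ is taken as the coupled-assimilation bound range, $Bnd_{U} - Bnd_{L}$, divided by $IQR_{101}$. That assumed ratio reproduces both captioned values. The function name is hypothetical.

```python
def empirical_sensitivity(bnd_lower, bnd_upper, p25, p75):
    """Assumed form: (Bnd_U - Bnd_L) / IQR, with IQR = 75th - 25th percentile."""
    iqr = p75 - p25
    if iqr <= 0.0:
        raise ValueError("interquartile range must be positive")
    return (bnd_upper - bnd_lower) / iqr


# Pilot point 38 (Figure 8, reported as a sensitive parameter).
se_pp38 = empirical_sensitivity(0.149, 0.351, 0.184, 0.217)

# Pilot point 16 (Figure 9, reported as an insensitive parameter).
se_pp16 = empirical_sensitivity(0.256, 0.428, 0.362, 0.409)

print(round(se_pp38, 1), round(se_pp16, 1))
```

Under this reading, a larger $S_{e}$ means the posterior samples narrowed more relative to the allowed range, which matches the captions labeling pilot point 38 as sensitive and pilot point 16 as insensitive. The threshold separating the two classes is not stated in the captions and is not assumed here.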

Table 1. Validation of the final model’s goodness-of-fit with prior-data conflicts removed.

Goodness-of-Fit Metric		Metric Value	Validation Range Minimum	Validation Range Maximum
NRMSE ¹		5.8%	0%	10%
NSE ²	4595	0.6	0.6	0.8
	7817	0.2	−0.9	0.5
	8155200	0.8 ³	0.6	0.7
	8158700	0.7	0.6	0.7
	8158810	0.7 ³	0.5	0.6
	8158813	0.3	−0.3	0.3
	8158827	0.7	0.6	0.8
	8170500	0.4 ³	−6.4	−4.3
	8170890	0.1 ⁴	0.7	0.8
	8170950	0.5	0.1	0.6
	8170990	0.6	0.3	0.4
	8171000	0.6	0.6	0.7
	8171290	0.7 ³	−0.2	0.3
	8171300	0.6	0.6	0.8
	8171350	0.5	0.3	0.9
	8171400	0.3 ³	−0.9	0.2
	Total Major Springs ⁵	1.5	1.0 ⁶	1.9 ⁶
	Total Major Focus Springs ⁷	1.2 ³	0.3 ⁶	1.0 ⁶
	Total All Gauging Stations	8.2	4.0 ⁶	10.3 ⁶

¹ The normalized root mean square error (NRMSE) is defined in Appendix A in conjunction with Equation (A2). A historically acceptable range of NRMSE for groundwater model validation is <10% [50]. ² Nash–Sutcliffe efficiency (NSE) [48,49] is defined in Equation (A3). The minimum and maximum range for each gauging station is determined from the biased variate NSE calculated as part of uncertainty envelope analysis in Ref. [19]. ³ When the calculated goodness-of-fit metric is better than the maximum from Ref. [19], it is possible that the model is overfit to the gauging station observations. ⁴ Station 8170890 is the only location where the validation NSE is less than the validation minimum. While not an ideal result, this is a relatively new gauging station on the Little Blanco River in the upper watershed with a short record, and it is one of the least important target locations for history matching. ⁵ “Total Major Springs” is the sum of 8170500 (San Marcos Springs), 8170950 (Pleasant Valley Springs), and 8170990 (Jacob’s Well Spring). ⁶ The totals across groups of gauging stations have minimum and maximum thresholds calculated using the unbiased variate NSE values for 8170500 San Marcos Springs in Ref. [19]. The unbiased variate minimum NSE for 8170500 is 0.6, and the maximum is 0.9 [19]. ⁷ The “Major Focus Springs” are 8170950 (Pleasant Valley Springs) and 8170990 (Jacob’s Well Spring), which are located within the “Integrated Focus Area” in Figure 2.
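The comparison in Table 1 between each NSE value and its validation range can be sketched as follows. This is an illustrative example only: it uses the standard Nash–Sutcliffe efficiency formula, which is assumed to match Equation (A3), and the helper names and the short test series are hypothetical. NRMSE is omitted because its normalization is set by Equation (A2), which is not reproduced here.

```python
import numpy as np


def nash_sutcliffe(observed, simulated):
    """Standard NSE: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    denominator = np.sum((obs - obs.mean()) ** 2)
    if denominator == 0.0:
        raise ValueError("observations have zero variance; NSE is undefined")
    return 1.0 - np.sum((obs - sim) ** 2) / denominator


def classify_against_range(value, minimum, maximum):
    """Place a metric value relative to its validation range."""
    if value < minimum:
        return "below"
    if value > maximum:
        return "above"
    return "within"


# Hypothetical series: a perfect match gives 1, simulating the mean gives 0.
nse_perfect = nash_sutcliffe([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0])
nse_mean = nash_sutcliffe([1.0, 2.0, 3.0, 4.0], [2.5, 2.5, 2.5, 2.5])

# Rows taken from Table 1.
flag_8155200 = classify_against_range(0.8, 0.6, 0.7)  # footnote 3: possible overfit
flag_8170890 = classify_against_range(0.1, 0.7, 0.8)  # footnote 4: below the minimum
flag_8158827 = classify_against_range(0.7, 0.6, 0.8)
```

A value flagged "above" corresponds to the possible overfitting noted in footnote 3, and "below" corresponds to the single shortfall discussed in footnote 4.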

Table 2. Goodness-of-fit validation metrics showing that the sensitivity analysis ensembles and final model are equally validated.

Goodness-of-Fit		Final ¹	S1	S2	S3	S4	S5
NRMSE	All Well Water Elevations	5.8%	5.8%	5.8%	5.8%	5.8%	5.8%
NSE	4595	0.6	0.6	0.6	0.6	0.6	0.6
	7817	0.2	0.2	0.2	0.2	0.2	0.2
	8155200	0.8	0.8	0.8	0.8	0.8	0.8
	8158700	0.7	0.7	0.7	0.7	0.7	0.7
	8158810	0.7	0.7	0.7	0.7	0.7	0.7
	8158813	0.3	0.3	0.3	0.3	0.3	0.3
	8158827	0.7	0.7	0.7	0.7	0.7	0.7
	8170500	0.4	0.3	0.4	0.4	0.4	0.4
	8170890	0.1	0.1	0.1	0.1	0.1	0.1
	8170950	0.5	0.6	0.5	0.5	0.5	0.6
	8170990	0.6	0.6	0.6	0.6	0.6	0.6
	8171000	0.6	0.7	0.6	0.6	0.6	0.6
	8171290	0.7	0.7	0.7	0.7	0.7	0.7
	8171300	0.6	0.6	0.6	0.6	0.6	0.6
	8171350	0.5	0.5	0.5	0.5	0.5	0.5
	8171400	0.3	0.3	0.3	0.3	0.3	0.4
	Total Major Springs ²	1.5	1.6	1.5	1.5	1.5	1.6
	Total All Gauging Stations	8.2	8.4	8.2	8.2	8.3	8.3

¹ Final model is labeled “Final”. ² “Total Major Springs” is the sum of 8170500, 8170950, and 8170990.

Table 3. Examples of changes to model conclusions across six equally validated models.

Model Prediction		Final ¹	S1	S2	S3	S4	S5
Average Simulated Water Storage Volume during 2020 in Middle Trinity Aquifer, excluding Cow Creek	Water Budget Region #4 Storage (Mm³)	6344	4782	6032	7237	8410	9557
	Water Budget Region #4 residence time multiplier ²	1.33	1.00	1.26	1.51	1.76	2.00
	Water Budget Region #5 Storage (Mm³)	8802	7139	8767	10,321	11,820	13,278
	Water Budget Region #5 residence time multiplier ²	1.23	1.00	1.23	1.45	1.66	1.86
	Water Budget Region #6 Storage (Mm³)	7575	6675	7722	8720	9675	10,595
	Water Budget Region #6 residence time multiplier ²	1.13	1.00	1.16	1.31	1.45	1.59
Average Simulated Water Storage Volume during 2020 in Cow Creek	Water Budget Region #4 Storage (Mm³)	2583	1731	2058	2367	2664	2952
	Water Budget Region #4 residence time multiplier ²	1.49	1.00	1.19	1.37	1.54	1.71
	Water Budget Region #5 Storage (Mm³)	9305	6022	7606	9150	10,664	12,155
	Water Budget Region #5 residence time multiplier ²	1.55	1.00	1.26	1.52	1.77	2.02
	Water Budget Region #6 Storage (Mm³)	4235	2700	3302	3871	4417	4945
	Water Budget Region #6 residence time multiplier ²	1.57	1.00	1.22	1.43	1.64	1.83

¹ Final model is labeled “Final”. ² Residence time multiplier is defined in Equation (4).
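The residence time multipliers in Table 3 can be reproduced from the tabulated storage volumes. Equation (4) in the main text is the authoritative definition; the sketch below assumes the multiplier is the ratio of each model's storage volume to the storage of ensemble S1, the smallest-storage case in every row. That assumption matches the tabulated multipliers to two decimals for the rows checked here. The function name is hypothetical.

```python
# Storage volumes (Mm^3) from Table 3, Middle Trinity Aquifer excluding Cow Creek,
# Water Budget Region #4.
storage_region4 = {
    "Final": 6344,
    "S1": 4782,
    "S2": 6032,
    "S3": 7237,
    "S4": 8410,
    "S5": 9557,
}


def residence_time_multipliers(storage, baseline="S1"):
    """Assumed form: storage of each model divided by the baseline model's storage."""
    base = storage[baseline]
    return {name: round(volume / base, 2) for name, volume in storage.items()}


multipliers_region4 = residence_time_multipliers(storage_region4)
print(multipliers_region4)
```

If the simulated fluxes through a region are essentially unchanged across equally validated models, residence time scales with storage, so a storage ratio of 2.00 for S5 implies roughly double the residence time of S1 in this region.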

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
