Article

Water Resources’ AI–ML Data Uncertainty Risk and Mitigation Using Data Assimilation

by Nick Martin 1,2,* and Jeremy White 3

1 Vodanube LLC, Fort Collins, CO 80525, USA
2 RESPEC, Fort Collins, CO 80525, USA
3 INTERA Incorporated, Fort Collins, CO 80524, USA
* Author to whom correspondence should be addressed.
Water 2024, 16(19), 2758; https://doi.org/10.3390/w16192758
Submission received: 10 August 2024 / Revised: 21 September 2024 / Accepted: 25 September 2024 / Published: 27 September 2024
(This article belongs to the Section Water Resources Management, Policy and Governance)

Abstract

Artificial intelligence (AI), including machine learning (ML) and deep learning (DL), learns by training and is limited by the amount and quality of its training data. Training involves a tradeoff between prediction bias and variance that is controlled by model complexity: increased complexity decreases prediction bias, increases variance, and increases the possibility of overfitting. Overfitting occurs when the training prediction error is significantly smaller than the trained model's prediction error for an independent validation set. Uncertain data generate risks for AI–ML because they increase overfitting and limit generalization ability. The uncertainty-related negative consequence is specious confidence in predictions from overfit models with limited generalization ability, which leads to misguided water resource management. Improved data are the way to improve AI–ML models, but for uncertain water resource data sets, like stream discharge, there is no quick way to generate improved data. Data assimilation (DA) mitigates uncertainty risks, describes data- and model-related uncertainty, and propagates uncertainty to results using observation error models. A DA-derived mitigation example is provided that uses a common-sense baseline, derived from an observation error model, to confirm generalization ability and to set a threshold that identifies overfitting. AI–ML models can also be incorporated into DA, either to provide additional observations for assimilation or to serve as a forward model for prediction and inverse-style calibration or training. Mitigating uncertain data risks with DA involves a modified bias–variance tradeoff that focuses on increasing solution variability at the expense of increased model bias. The increased variability portrays data and model uncertainty, and uncertainty propagation produces an ensemble of models and a range of predictions.
Keywords: uncertainty; bias–variance tradeoff; overfitting; AI–ML; data assimilation (DA); observation error model; PEST++; ensemble methods; deep learning (DL)
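
The overfitting definition and the baseline idea in the abstract can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions, not the authors' procedure: the multiplicative Gaussian observation error model, the 10% relative standard deviation (`rel_sd`), the `ratio` threshold, and the function names `baseline_rmse` and `diagnose` are all hypothetical choices introduced here.

```python
import math
import random


def rmse(pred, obs):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def baseline_rmse(obs, rel_sd=0.10, n_draws=200, seed=0):
    """Average RMSE between the observations and noisy realizations of them.

    Assumed observation error model (hypothetical): each observation is
    perturbed by a multiplicative Gaussian error with relative standard
    deviation rel_sd. The result approximates the error that observation
    uncertainty alone would produce, which serves here as a baseline.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    total = 0.0
    for _ in range(n_draws):
        noisy = [o * (1.0 + rng.gauss(0.0, rel_sd)) for o in obs]
        total += rmse(noisy, obs)
    return total / n_draws


def diagnose(train_err, valid_err, baseline, ratio=2.0):
    """Return two simple flags (the decision rules are assumptions).

    below_noise_floor: training error is smaller than the baseline, which
        suggests the model may be fitting observation noise.
    train_valid_gap: validation error exceeds ratio times the training
        error, i.e., training error is much smaller than validation error.
    """
    return {
        "below_noise_floor": train_err < baseline,
        "train_valid_gap": valid_err > ratio * train_err,
    }
```

As a usage example, `baseline_rmse` could be called on a list of observed stream discharge values, and the result passed to `diagnose` together with the training and validation RMSE of a trained model. A model that raises both flags would be a candidate for reduced complexity or stronger regularization; how the thresholds should be set in practice depends on the observation error model adopted.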
