# **Soft Computing and Machine Learning in Dam Engineering**

Edited by

M. Amin Hariri-Ardebili, Fernando Salazar, Farhad Pourkamali-Anaraki, Guido Mazzà and Juan Mata

Printed Edition of the Special Issue Published in *Water*

www.mdpi.com/journal/water

## **Soft Computing and Machine Learning in Dam Engineering**

Editors

**M. Amin Hariri-Ardebili, Fernando Salazar, Farhad Pourkamali-Anaraki, Guido Mazzà and Juan Mata**

MDPI • Basel • Beijing • Wuhan • Barcelona • Belgrade • Manchester • Tokyo • Cluj • Tianjin


Farhad Pourkamali-Anaraki Department of Mathematical and Statistical Science University of Colorado Denver United States

*Editorial Office* MDPI St. Alban-Anlage 66 4052 Basel, Switzerland

This is a reprint of articles from the Special Issue published online in the open access journal *Water* (ISSN 2073-4441) (available at: www.mdpi.com/journal/water/special_issues/SCML).

For citation purposes, cite each article independently as indicated on the article page online and as indicated below:

LastName, A.A.; LastName, B.B.; LastName, C.C. Article Title. *Journal Name* **Year**, *Volume Number*, Page Range.

**ISBN 978-3-0365-7579-7 (Hbk) ISBN 978-3-0365-7578-0 (PDF)**

Cover image courtesy of Mohammad Amin Hariri-Ardebili

© 2023 by the authors. Articles in this book are Open Access and distributed under the Creative Commons Attribution (CC BY) license, which allows users to download, copy and build upon published articles, as long as the author and publisher are properly credited, which ensures maximum dissemination and a wider impact of our publications.

The book as a whole is distributed by MDPI under the terms and conditions of the Creative Commons license CC BY-NC-ND.

## **Contents**


## **Preface to "Soft Computing and Machine Learning in Dam Engineering"**

On behalf of the International Commission on Large Dams (ICOLD), I am honored to provide an opening dialog for this Special Issue of *Water*, "Soft Computing and Machine Learning in Dam Engineering". I have been fortunate to be part of the ICOLD for more than 20 years, which is an international organization founded on and committed to the premise that society is best served when nations communicate and collaborate for the safety and service of dams and levees. For almost 100 years, the ICOLD has encouraged collaborations over political division, as it now has 104 member nations with more than 15,000 individual members from around the world. The ICOLD has relentlessly dedicated itself as an organization to working across geographical and political boundaries to support individuals and nations by fostering a commitment to mutual support and collaboration for safety in dam engineering.

Water and dams have provided many challenges to engineers throughout the ages, demonstrating great success as critical infrastructure for providing sustenance and improving quality of life for billions of people around the globe. Unfortunately, there have also been great failures of dams and levees due to engineering, operational, or other unfortunate sequences of realized failure modes, many of which can be traced to deficient learning and training. In many situations, education and training can be a key differentiator between success or failure in dam safety, with precious lives and significant investments weighing in the balance as owners and regulators look to protect lives and investments.

Dam safety is the common theme and focus of many organizations, conferences, groups, societies, and individuals around the world. In the best cases, dam safety by intentional design that considers future functions and environmental conditions provides engineers with the critical insights into lifecycle influences of change that envelop risk as projects progress through planning, design, construction, and operation. As engineers, we must always be open to learning from the lessons of the past and to be educated on new technologies of the future.

The dam engineering industry has seen the growth of risk-informed decision making (RIDM) in the last few decades as engineers and owners have learned to pivot their focus of design based on historically developed guidelines to consider modes and sequences of events at dams that could potentially lead to failure. The future of RIDM combined with machine learning offers an intriguing path for engineers to consider as we combine the power of computing machines with the critical thinking of the human mind to work towards a better understanding of risk and the elimination of catastrophic dam failures.

I am thankful to MDPI's journal *Water* for its initiative in preparing this Special Issue focusing on the important technological areas of "Soft Computing and Machine Learning in Dam Engineering". The editors and authors of this collection have provided great insights into the growing field of dam engineering, which is of great importance to all of us as engineers and scientists who focus on dams. This Special Issue highlights the emerging thinking of these authors in the legacy field of civil engineering for dams. The technology of computer science is evolving in its applications from traditional numerical methods to computational techniques that mimic human-like problem-solving behavior, which will significantly change our industry.

Soft computing techniques have been developing in several different areas of engineering applications for many years. The application of these technologies in civil engineering comes with a vast potential number of applications for dams and levee projects. The papers in this Special Issue address the development and evolution of these new learning techniques for engineers and their computational machines. The findings outlined provide important information for dam engineers and scientists around the world. I am hopeful that this Special Issue leads to additional creative development and critical thinking by man and machine on the topics provided herein.

Dam safety must continue to be an emotional driver for engineers in our profession, and continual learning and training are the hallmark of our commitment to safety. *Water* and organizations such as the ICOLD represent the global camaraderie of the engineering profession regarding education and training on dams and levees. On behalf of the ICOLD, I am thankful for this opportunity to develop a partnership with MDPI to support and recognize "Soft Computing and Machine Learning in Dam Engineering", which provides a glimpse into the future of dam engineering.

By Michael F. Rogers, Honorary President, International Commission on Large Dams.

## **M. Amin Hariri-Ardebili, Fernando Salazar, Farhad Pourkamali-Anaraki, Guido Mazzà, and Juan Mata**

*Editors*

## *Editorial* **Soft Computing and Machine Learning in Dam Engineering**

**Mohammad Amin Hariri-Ardebili 1,2,\* , Fernando Salazar <sup>3</sup> , Farhad Pourkamali-Anaraki <sup>4</sup> , Guido Mazzà <sup>5</sup> and Juan Mata <sup>6</sup>**


#### **1. Introduction and Overview**

Dams have played a vital role in human civilization for thousands of years, providing essential resources such as water and electricity, and performing important functions such as flood control. The scale and complexity of dam projects have increased in recent years, making their safety evaluation even more challenging. Therefore, it is crucial that dam engineers consider all potential risks and take appropriate measures to ensure the safety and stability of these structures [1]. The nature and existence of dams are highly coupled with concepts such as population growth, climate change, global warming, and water security [2]. According to the most recent update (April 2020) of the International Commission on Large Dams (ICOLD) [3], there are about 58,700 registered large dams in the world. Figure 1 illustrates the global distribution of these large dams.

**Figure 1.** Global distribution of large dams as of 2020.

Traditional dam safety methods, based on visual inspections and manual monitoring, have long been the standard for ensuring the stability and safety of dams. However, as the scale and complexity of dam projects have increased, these methods have become increasingly insufficient. Major limitations of traditional dam safety methods are the existence of deficient observation plans and the potential for human error. Inspectors may miss crucial signs of deterioration or failure, and manual monitoring can be prone to inaccuracies. In addition, as the number of (aged and new) dams continues to increase, it becomes increasingly difficult and resource-intensive to manually inspect and monitor each one. Another limitation of traditional dam safety methods is that they are typically reactive rather than proactive. They focus on identifying and addressing problems after they have already occurred, rather than predicting and preventing them.

**Citation:** Hariri-Ardebili, M.A.; Salazar, F.; Pourkamali-Anaraki, F.; Mazzà, G.; Mata, J. Soft Computing and Machine Learning in Dam Engineering. *Water* **2023**, *15*, 917. https://doi.org/10.3390/w15050917

Received: 31 January 2023 Accepted: 7 February 2023 Published: 27 February 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In contrast, modern techniques such as remote sensing, drones, and sensor networks can provide more accurate, real-time data on dam conditions. They can also be used to continuously monitor dams, providing an early warning of potential problems. Artificial Intelligence (AI) can be applied to the data collected from these modern techniques for identifying patterns and anomalies that may indicate a potential problem. AI algorithms can be used in the decision-making process for dam safety by providing accurate and updated risk analysis.

#### **2. Soft Computing in Dam Engineering**

Soft computing is a collection of techniques in computer science that aim to provide solutions to problems that are difficult or impossible to solve using traditional, "hard" methods of computation. Soft computing encompasses various computational techniques that are designed to replicate human-like problem-solving behavior. It includes a variety of techniques such as fuzzy logic, neural networks, genetic algorithms, and probabilistic reasoning, which are used to solve problems that are too complicated for traditional, rule-based approaches [4]. Soft computing has a wide range of applications in engineering, including control systems, signal processing, pattern recognition, and optimization. For example, in signal processing, neural networks and genetic algorithms can be used to improve the accuracy of signal classification and feature extraction. In optimization, genetic algorithms and probabilistic reasoning can be used to solve complex optimization problems.

The use of soft computing techniques, such as fuzzy logic and neural networks, in dam engineering began to gain popularity in the late 1990s and early 2000s. The first application of these techniques for modeling dam behavior is arguably the work by Bossoney [5], closely followed by Hattingh [6]. The main purpose was to overcome the limitations of the traditional Hydrostatic-Season-Time (HST) model [7] in terms of the identification of nonlinear behavior and consideration of complex phenomena. This has been the main application of soft computing in dam engineering to date, favored by the development of new algorithms and the increase in available monitoring data due to the installation of automatic data acquisition systems (ADAS). In this regard, the ICOLD Benchmark Workshop held in 2001 was a milestone, since for the first time solutions were presented with methods such as K-nearest neighbors [8] or nonlinear autoregressive exogenous models (NARX) [9].
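For reference, the HST model mentioned above is commonly written as a sum of three separable effects. The form below is one frequently used formulation, given here as an illustrative assumption; the exact terms differ among authors and are not taken from [7]:

$$
\delta(h, s, t) = a_0
+ \underbrace{a_1 h + a_2 h^2 + a_3 h^3 + a_4 h^4}_{\text{hydrostatic}}
+ \underbrace{a_5 \cos s + a_6 \sin s + a_7 \sin 2s + a_8 \cos 2s}_{\text{seasonal}}
+ \underbrace{a_9 t + a_{10} e^{-t}}_{\text{time}}
$$

Here $\delta$ is the monitored response (e.g., a displacement), $h$ is the normalized reservoir level, $s = 2\pi d / 365.25$ with $d$ the day of the year, and $t$ is the normalized elapsed time; the coefficients $a_i$ are typically fitted by least squares. Because a model of this kind is additive in its three effects, it cannot represent interactions between loads, which is part of the limitation that soft computing approaches aim to overcome.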

Later on, and in parallel with the aforementioned application, soft computing began to be used as a surrogate for finite element models in analyses with high computational costs. The typical example is the study of the probability of failure with Monte Carlo-type methods, which requires executing an extremely high number of numerical simulations [10]. More recently, the use of these techniques in dam engineering has grown significantly, as specific libraries have become available in different programming languages, which are easier to implement and apply [11–14]. Today, soft computing is used in a variety of applications related to dam engineering, such as:


Despite the advantages discussed above, soft computing methods have several limitations in dam engineering problems. Machine learning models require a large amount of data to be trained and tested, which can be a limitation in dam engineering, where data collection can be difficult and expensive. Data availability can be the main obstacle to the application of machine learning models in numerical simulation of large-scale dams [28]. The quality of the data used to train and test machine learning models is crucial for the accuracy and reliability of the models. Data in dam engineering can be noisy, incomplete (e.g., missing sensor) or biased, which can negatively impact the performance of the models [29]. Complex machine learning models can be difficult to interpret and understand [30], which can be a limitation when trying to explain the results of the models to non-experts. It is important to validate the models using independent data sets, but this can be difficult and time-consuming in dam engineering. Dam systems are dynamic and subject to change. Machine learning models may not be able to adapt to changing conditions, which could lead to poor predictions. Dam engineering is a critical field that affects the safety of human lives, property and the environment. Therefore, the reliability and safety of the machine learning models used in dam engineering need to be carefully evaluated.

The challenges are increasingly complex. The differences among dam owners in terms of financial and human resources are a crucial aspect to consider. The investment differs among dam owners and among dams, since installing new surveillance systems in old dams is often more difficult than in new dams. Adopting adequate monitoring plans is mandatory, as is the promotion of surveillance activities by expert engineers. The constitution of multidisciplinary teams and the exploitation of the possibilities of soft computing and machine learning techniques are essential to adequately respond to the needs of dam surveillance activities. Sharing knowledge between scientists and practitioners is also a key element for improving surveillance activities.

#### **3. About the Special Issue**

In May 2020, a team of guest editors specialized in different aspects of dam engineering and machine learning proposed to launch a Special Issue, "Soft Computing and Machine Learning in Dam Engineering", in the journal *Water*. This Special Issue aimed to capture the recent increase in research activity at the interface of dam engineering and machine learning methods.

In this Special Issue, we solicited high-quality original research articles focused on state-of-the-art techniques and methods employed in the design and analysis of dams. We welcomed both theoretical and application papers of high technical standards across various disciplines, thus facilitating an awareness of techniques and methods in one area that may apply to other areas.

This book includes ten contributions to this Special Issue published between 2020 and 2023. The acceptance rate was less than 50%, which is an acceptable rate for a technical Special Issue, where nearly all the submissions were by invitation. The overall aim of the collection is to improve our understanding of the applications of soft computing in dam engineering, including their challenges.

Figure 2 shows a "word cloud" data-mined from all accepted papers, indicating repetition of relevant keywords. The accepted papers cover a wide range of dam engineering-related topics. In a very broad classification, one may identify the following major categories: (1) probabilistic simulations, (2) risk-based methods, (3) stochastic input motion, (4) uncertainty quantification, and (5) applied machine learning including validation.

**Figure 2.** Word cloud from all the accepted papers in this Special Issue.

#### **4. Contributions to Current Special Issue**

Deterministic analysis never provides a comprehensive performance assessment of structural systems. Therefore, probabilistic methods have emerged as a promising alternative. However, such methods, in addition to being computationally expensive, can produce very different solutions, depending on the input parameters, which can greatly influence the decision making. The importance of probabilistic methods in dam engineering has been discussed previously in [31–33]. In the paper by Segura et al. [34], "Accounting for Uncertainties in the Safety Assessment of Concrete Gravity Dams: A Probabilistic Approach with Sample Optimization", the authors proposed a probabilistic-based methodology for assessing the safety of dams under usual, unusual, and extreme loading conditions. This allows the analysis to be updated while avoiding unnecessary simulation runs by classifying the load cases according to the annual probability of exceedance and by using an efficient progressive sampling strategy. They also conducted a variance-based global sensitivity analysis to identify the most influential parameters affecting the dam response.
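The cost of sampling-based probabilistic assessment can be illustrated with a crude Monte Carlo estimate of a failure probability. The sketch below is not the methodology of [34]; it assumes a toy limit state `g = R - S` with independent normal resistance `R` and load `S`, and all names and parameter values are hypothetical. In a real dam analysis each sample would require a numerical simulation rather than a closed-form expression.

```python
import random
from statistics import NormalDist


def failure_probability(n_samples, mu_r=10.0, sd_r=1.5, mu_s=6.0, sd_s=1.3, seed=0):
    """Crude Monte Carlo estimate of P[g < 0] for the toy limit state g = R - S."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    failures = 0
    for _ in range(n_samples):
        resistance = rng.gauss(mu_r, sd_r)  # hypothetical capacity
        load = rng.gauss(mu_s, sd_s)        # hypothetical demand
        if resistance - load < 0.0:
            failures += 1
    return failures / n_samples


def exact_failure_probability(mu_r=10.0, sd_r=1.5, mu_s=6.0, sd_s=1.3):
    """Closed-form reference, available only because R and S are independent normals."""
    beta = (mu_r - mu_s) / (sd_r ** 2 + sd_s ** 2) ** 0.5  # reliability index
    return NormalDist().cdf(-beta)


pf_mc = failure_probability(200_000)
pf_exact = exact_failure_probability()
print(f"Monte Carlo: {pf_mc:.4f}, closed form: {pf_exact:.4f}")
```

The standard error of the estimate scales with the inverse square root of the number of samples, so small failure probabilities call for very large sample sizes; this is the motivation for the sampling strategies and surrogate models discussed in this Special Issue.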

While the probabilistic simulations can be used to extract the structural capacity of dams, it is not always possible to perform hundreds of simulations for highly nonlinear systems. The alternative is to use approximate methods such as endurance time analysis (ETA) [35]. ETA is a dynamic pushover procedure that evaluates the structural performance of a system from the linear to the nonlinear range using a single simulation. In a paper by Alegre et al. [36], "Seismic Safety Assessment of Arch Dams Using an ETA-Based Method with Control of Tensile and Compressive Damage", the authors present an ETA-based method for seismic safety assessments of arch dams using tensile and compressive damage models. The seismic performance is evaluated by controlling the evolution of the damage state of the dam, according to predefined performance criteria, to estimate acceleration endurance limits for tensile and compressive damage. They evaluated the dam response at two seismic hazard levels, i.e., Operating Basis Earthquake and Safety Evaluation Earthquake.

The probabilistic seismic performance of dams typically results in fragility curves or capacity functions that are useful for the engineers [33,37,38]. However, they do not provide a direct connection between the failure probability at different seismic hazard levels and the associated risk for downstream population and properties. In a paper by Ferguson [39], "Risk-Informed Design of RCC Dams under Extreme Seismic Loading", the author proposed a practical framework for risk-informed design of concrete dams. The results of 2D and 3D numerical simulations were used to drive the risk metrics and feasibility level design.

The machine learning techniques can be used to process the results of numerical simulations. They can be used for both "analysis" and "design" purposes. In a paper by Shahzadi and Soulaïmani [40], "Deep Neural Network and Polynomial Chaos Expansion-Based Surrogate Models for Sensitivity and Uncertainty Propagation: An Application to a Rockfill Dam", the authors used two machine learning techniques to build a surrogate model of a rockfill dam considering the uncertainties in constitutive soil parameters. Furthermore, they found that shear modulus and the Poisson coefficient are the parameters that play the most significant role in the dam's behavior.

In a separate study by Hariri-Ardebili and Pourkamali-Anaraki [41], "An Automated Machine Learning Engine with Inverse Analysis for Seismic Design of Dams", the authors used automated machine learning (AutoML) for the design of new dams. They first developed a large database of about 24,000 simulations in which the uncertainties associated with shape, material properties, water level, and ground motion records are incorporated. Next, AutoML is used to generate a surrogate model of dam response as a function of input variables. A simple yet robust inverse analysis method is coupled with a multi-output surrogate model to design the new dams in which only part of the data are available. The design shape from the inverse analysis is in good agreement with the design objectives and also the finite element simulations.

Aside from the application of soft computing methods in regression and classification of data, they can also be used for the sensitivity analysis of dams. In a paper by Hariri-Ardebili et al. [42], "An RF-PCE Hybrid Surrogate Model for Sensitivity Analysis of Dams", the authors proposed two techniques for the sensitivity assessment of concrete dams with heterogeneous concrete, i.e., a polynomial chaos expansion and random forest. They used these techniques to identify the areas of the dam in which the variation of material properties has the highest impact on the vibration response. Their findings can improve the process of system identification for old dams.

Another complex aspect, which is seldom analyzed, is the effect of ice loads on dam displacements. This, which obviously affects dams in cold regions, was studied with an innovative approach in a paper by Hellgren et al. [43], "Estimating the Ice Loads on Concrete Dams Based on Their Structural Response". The authors estimated the magnitude of ice loads on five dams in Sweden, four concrete buttress dams and one arch dam. The results suggested that the estimates of ice loads from measurement sensors and from design guidelines are over-conservative.

Soft computing techniques are also capable of jointly analyzing a set of monitoring records. In a paper by Salazar et al. [44], "Anomaly Detection in Dam Behaviour with Machine Learning Classification Models", machine learning classifiers based on support vector machines and random forests are tested for detecting anomalies in a double-curvature arch dam. Results show the potential of this approach as a robust procedure for novelty detection. The main limitation, also identified, is the need for high-quality monitoring data.

The dissemination of machine learning techniques has promoted the development of new models for dam behavior prediction. However, validating these models based on the dam engineer's knowledge is fundamental for their adequate use. This issue is tackled in the paper by Mata et al. [45] "Validation of Machine Learning Models for Structural Dam Behaviour Interpretation and Prediction". The authors present a methodology based on several validation techniques, including historical data validation, sensitivity analysis, and predictive validation for the practical application of data-based models for structural dam behavior prediction in daily dam surveillance activities.

Finally, in a paper by Mata et al. [46] "Characterization of Relative Movements between Blocks Observed in a Concrete Dam and Definition of Thresholds for Novelty Identification Based on Machine Learning Models", the authors present a methodology for the earlier detection of novelties through the analysis of the residuals of prediction models, taking into account the evolution of the records over time and the simultaneity of the structural responses measured in a concrete dam, namely through the threshold definition based on a singular record, a moving time period, and multivariate records.
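The idea of defining thresholds on the residuals of a prediction model can be sketched in a few lines. The example below is an illustration only and does not reproduce the methodology of [46]; the function names, the factor `k`, the window length, and the limit value are all hypothetical choices. It shows a threshold applied to single records and a threshold applied to the mean residual over a moving time window.

```python
from statistics import mean, stdev


def fixed_threshold_flags(residuals, reference, k=3.0):
    """Flag records whose residual departs from the reference mean by more than k standard deviations."""
    mu, sd = mean(reference), stdev(reference)
    return [abs(r - mu) > k * sd for r in residuals]


def moving_window_flags(residuals, window=3, limit=0.45):
    """Flag time steps where the mean residual over the trailing window exceeds the limit."""
    flags = []
    for i in range(len(residuals)):
        if i + 1 < window:
            flags.append(False)  # not enough history yet
            continue
        avg = mean(residuals[i + 1 - window:i + 1])
        flags.append(abs(avg) > limit)
    return flags


# Hypothetical residuals (observed minus predicted) from a period of normal behavior.
reference = [0.1, -0.2, 0.0, 0.15, -0.1, 0.05, -0.05, 0.2, -0.15, 0.0]

single = fixed_threshold_flags([0.1, -0.3, 0.5, 0.0], reference)
drift = moving_window_flags([0.0, 0.1, 0.4, 0.5, 0.6, 0.7])
print(single)
print(drift)
```

A single-record threshold reacts to isolated outliers, whereas the moving-window variant is less sensitive to one noisy reading and responds to a sustained drift in the residuals.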

#### **5. Future Research Directions**

The dissemination and implementation of the scientific and technological advances achieved in the dam surveillance area during the last decade are not yet effective for most dam owners, even for large dams. The trend towards the installation and use of ADAS, recommended in monitoring plans of new large dams, is a great opportunity to assess safety conditions in real time, but it requires tools to process big data sets. Machine learning and deep learning provide dam safety engineers with essential functionalities for an efficient and effective enhancement of these tools, in order to adequately satisfy the needs resulting from dam surveillance activities. Some of the future advances in this area to fill existing gaps may include the development of methodologies and tools for:


We hope that this Special Issue will shed light on the recent advances and developments in the area of soft computing and dam engineering, and attract the attention of the scientific community to pursue further research and studies on simulation and modeling of dams and appurtenant structures.

**Acknowledgments:** We would like to express our appreciation to all authors for their informative contributions and the reviewers for their support and constructive critiques that made this special journal issue possible.

**Conflicts of Interest:** The authors declare no conflict of interest.

**Disclaimer:** The views, opinions, and strategies expressed by the authors are theirs alone, and do not necessarily reflect the views, opinions, and strategies of their affiliated universities, organizations and committees.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

**Mohammad Amin Hariri-Ardebili 1,2,\* , Golsa Mahdavi <sup>2</sup> , Azam Abdollahi <sup>3</sup> and Ali Amini <sup>4</sup>**


**Abstract:** Quantification of structural vibration characteristics is an essential task prior to performing any dynamic health monitoring and system identification. The anatomy of vibration in concrete arch dams (especially tall dams with an asymmetric shape) is very complicated and requires special techniques to solve the eigenvalue problem. The situation becomes even more complicated if the material distribution is assumed to be heterogeneous within the dam body (as opposed to the conventional isotropic homogeneous assumption). This paper proposes a hybrid Random Field (RF)–Polynomial Chaos Expansion (PCE) surrogate model for uncertainty quantification and sensitivity assessment of dams. For different vibration modes, the most sensitive spatial locations within the dam body are identified using both Sobol's indices and correlation rank methods. Results of the proposed hybrid model are further validated using the classical random forest regression method. The outcome of this study can improve the results of system identification and dynamic analysis by properly determining the vibration characteristics.

**Keywords:** dams; Polynomial Chaos Expansion; random fields; random forest; vibration analysis

#### **1. Introduction**

Determination of the vibration characteristics in concrete dams (especially asymmetric tall arch dams) is an essential task prior to performing any nonlinear seismic analysis [1], dynamic health monitoring [2,3], and system identification [4,5]. This research topic has received a lot of recognition in the past few years [6], and several advanced techniques have been developed, such as the one by Sevieri and De Falco [7] based on a Bayesian inference model. Among the many assumptions made to formulate the numerical model, the most fundamental one is an isotropic homogeneous concrete (or at least zoned concrete) material distribution within the dam body [8]. While this assumption can be valid for new dams or those without (visible or hidden) physical damage, it cannot be used widely for damaged or aged dams. In fact, several recent studies show that macro-scale heterogeneity [9] might have a great influence on the dynamic characteristics of large concrete structures (e.g., concrete dams [10] and nuclear containment buildings [11]).

The main objective of this paper is to solve the heterogeneous vibration analysis and dynamic identification of arch dams using surrogate modeling. Polynomial chaos expansion (PCE) is used in conjunction with Latin Hypercube sampling (LHS) and random field (RF) theory. The hybrid RF-PCE surrogate model is then used to identify the most sensitive regions of two (symmetric and asymmetric) arch dams at different vibration modes. Such a hybrid meta-model has not been used before for dams at any level. The findings of this study will help to identify the locations of dams (and any other infrastructure) that contribute most to various vibration frequencies. This will later be used to perform a successful transient analysis for large-scale heterogeneous structures and can greatly improve the model validation and verification.

**Citation:** Hariri-Ardebili, M.A.; Mahdavi, G.; Abdollahi, A.; Amini, A. An RF-PCE Hybrid Surrogate Model for Sensitivity Analysis of Dams. *Water* **2021**, *13*, 302. https://doi.org/10.3390/w13030302

Academic Editor: Zhi-jun Dai Received: 28 December 2020 Accepted: 23 January 2021 Published: 26 January 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

First, a comprehensive literature review is provided in Section 1.1, followed in Section 2 by a brief review of the theory of random fields and polynomial chaos expansion and an introduction to the hybrid model. The case study dams are described in Section 3, and the results are discussed in Section 4. Finally, the PCE-based sensitivity analysis is validated by the classical random forest method in Section 5.

#### *1.1. Literature Review*

This section provides a literature review on the application of the PCE meta-model to dams and some other types of infrastructure.

Guo et al. [12] investigated the reliability analysis of an embankment dam by sparse PCE, as well as the identification of the most important random variables (RVs) by Sobol indices. Two different deterministic approaches were compared to the probabilistic method. The results showed that the deterministic approaches lead to similar reliability results in terms of the failure probability, *P<sub>f</sub>*, the distribution of the first order second moment (FOSM), and the sensitivity index for each RV. They reported that soil dry density, effective cohesion, and friction angle have the most dominant effect on the sliding stability assessment of the embankment dam. Sevieri et al. [13] proposed a framework for post-earthquake safety assessment of existing dams. The Bayesian updating technique was applied to reduce the uncertainties related to the modeling procedure, material properties, and geometry definition of gravity dams. The finite element (FE) model response related to water level variation was approximated via the PCE method and used for calibration. Hariri-Ardebili and Sudret [14] investigated the application of the PCE meta-model in uncertainty quantification (UQ) of different implicit and explicit dam engineering problems. They found that PCE can develop a surrogate model with a very limited number of initial simulations and reduce the computational time considerably. Even a sample size as low as 5% of the initial database can provide an acceptable engineering result. They performed a comprehensive parametric study on the impact of different sample sizes, quantities of interest (QoIs), statistical parameters (e.g., mean, variance, error), sampling methods (e.g., LHS, Halton, Sobol), and spatial variability.

Aside from dam engineering problems, PCE has been applied in several other engineering fields. Kim [15] performed a comparative study on the capability of meta-models in UQ of a building energy model. To bypass the inherently high computational cost of building performance simulation tools, the PCE method was applied along with a Gaussian process emulator (GPE). The results on the total energy consumption of a real case study demonstrated that both meta-models can be used as an alternative to crude Monte Carlo simulation (MCS). It was reported that GPE leads to more reliable outputs than PCE as the number of samples is increased. Wei et al. [16] implemented PCE-based UQ of mechanical properties in the dynamic analysis of fiber reinforced polymer (FRP). The longitudinal elastic, transverse elastic, shear, and Young's moduli were considered as RVs, while the QoIs were assumed to be the natural frequencies, modal mass, and absolute peak acceleration at the mid-span of the bridge. They found that the RVs have more impact on the natural frequencies than on the modal mass. Slot et al. [17] applied PCE and Kriging meta-models to perform a fatigue reliability assessment of a wind turbine with wind direction, wind speed, turbulence, wind shear, air density, and flow inclination as RVs. Overall, they reported that the Kriging meta-model leads to more accurate results than PCE for the studied turbine. Ni et al. [18] performed a PCE-aided UQ of bridge structures including eigenvalue analysis, dynamic analysis of a linear bridge-vehicle system, and probabilistic nonlinear seismic structural analysis of a bridge. Surrogate-assisted results were compared and contrasted with the crude MCS and FOSM approaches. It was shown that PCE-assisted UQ demonstrates better performance in terms of accuracy and efficiency. As an application of meta-models in the context of nuclear engineering, Bouhjiti et al. [19] applied the PCE method to develop a global stochastic FE method to predict the uncertainty propagation in a thermo-hydro-mechanical leakage (THM-L) model. It is a framework to simulate the behavior of large reinforced and pre-stressed structures under uncertainties due to material and loading. The PCE-assisted UQ reduces the RVs and facilitates the probabilistic representation of the THM-L model through a cost-effective reliability analysis.

In connection with random fields, Dubreuil et al. [20] developed an adaptive RF representation dedicated to the approximation of extreme value statistics. The proposed approach was based on a discretization of the RF by hybrid PCE-Kriging approaches and could provide an accurate representation of the global extremum for any realization. This framework benefits from partial least squares and sparse PCE algorithms in the Kriging and PCE meta-models, respectively. It was concluded that the hybrid application of PCE and Kriging methods could play an important role in RF discretization. Guo et al. [21] performed an RF-RV reliability analysis of an earth dam in terms of sliding stability considering the soil uncertainties. The performance of different reliability approaches was investigated, including crude MCS, subset simulation (SS), moment method (MM), hybrid sparse PCE-global sensitivity analysis (SPCE/GSA), and coupled sparse PCE-sliced inverse regression (SPCE/SIR). The results indicated that the most accurate and efficient methods are SPCE/GSA and SS. In addition, they reported that the efficiency of surrogate-aided approaches mainly relies on the number of input RVs. Then, in an SPCE/GSA-based stability study, Guo et al. [22] focused on the effect of the cross-correlation dependency between effective cohesion and friction angle, considering an RF representation of the soil parameters via the Karhunen-Loève expansion. The QoIs were assumed to be the PDF, the statistical parameters of the model response, the failure probability, and the Sobol index of each RV. It was concluded that applying spatial variability for embankment dams leads to a lower *P<sub>f</sub>* estimation and dispersed safety factor values. Furthermore, *P<sub>f</sub>* increases with increasing auto-correlation distance, cross-correlation, and coefficient of variation (COV). Mishra et al. [23] proposed a PCE-assisted, reliability-based life-cycle management framework for buried pipelines under time-variant corrosion damage. After developing the PCE meta-model, an optimization algorithm was proposed to address the life-cycle management of buried pipelines. They concluded that PCE-based RF development facilitates the probabilistic configuration and temporal correlation of corrosion depth.

#### **2. Background Theory**

#### *2.1. Polynomial Chaos Expansion*

The computational model of a structural system can be represented by $\mathcal{M}$ as a function of $M$ input parameters, which are modeled by an $M$-dimensional random vector $\mathbf{X} = \{X_1, X_2, \ldots, X_M\}$ with marginal probability density functions $f_{X_i}$, $i = 1, \ldots, M$. The scalar QoI resulting from this system is an RV, denoted $Y = \mathcal{M}(\mathbf{X})$, and can be represented as a PCE [24] truncated to a finite sum:

$$Y^{\rm PCE} = \mathcal{M}^{\rm PCE}(\mathbf{X}) = \sum_{\boldsymbol{\alpha} \in \mathcal{A}} y_{\boldsymbol{\alpha}} \Psi_{\boldsymbol{\alpha}}(\mathbf{X}),\tag{1}$$

where $\Psi_{\boldsymbol{\alpha}}(\mathbf{X})$ are the multi-variate polynomials orthonormal with respect to $f_{\mathbf{X}}$, $y_{\boldsymbol{\alpha}}$ are the expansion coefficients, $\boldsymbol{\alpha}$ are multi-indices that identify the components of the multi-variate polynomials, and $\mathcal{A}$ is the truncation set of multi-indices of cardinality $P$. This format of PCE is typically used for simulation-based uncertainty quantification problems.

There are two main truncation schemes, i.e., standard [25] and hyperbolic [26]. The hyperbolic truncation is a modification to the standard version (which is based on selection of all polynomials in *M* input RVs of total degree not exceeding *p*), and uses the parametric *q*-norm to define the truncation (for *q* = 1, the hyperbolic truncation is identical to the standard one):

$$\mathcal{A}^{M,p,q} = \left\{ \boldsymbol{\alpha} \in \mathcal{A}^{M,p} \;:\; \|\boldsymbol{\alpha}\|_{q} \le p \right\}, \qquad \|\boldsymbol{\alpha}\|_{q} = \left(\sum_{i=1}^{M} \alpha_{i}^{q}\right)^{1/q}.\tag{2}$$
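As an illustration of Equation (2), the following minimal Python sketch enumerates the truncation set by brute force. The function name `hyperbolic_index_set` and the small values of *M* and *p* are assumptions made for this example only; a full enumeration of this kind is practical only for a few input variables, not for the high-dimensional cases studied later.

```python
from itertools import product


def hyperbolic_index_set(M, p, q=1.0):
    """Return all multi-indices alpha (tuples of length M) with ||alpha||_q <= p."""
    kept = []
    # Each component can be at most p, so it suffices to scan {0, ..., p}^M.
    for alpha in product(range(p + 1), repeat=M):
        q_norm = sum(a ** q for a in alpha) ** (1.0 / q)
        if q_norm <= p + 1e-12:  # small tolerance for floating-point round-off
            kept.append(alpha)
    return kept


# Hypothetical small case: 3 inputs, total degree 3.
standard = hyperbolic_index_set(M=3, p=3, q=1.0)   # standard truncation
sparse = hyperbolic_index_set(M=3, p=3, q=0.75)    # hyperbolic truncation
print(len(standard), len(sparse))
```

With *q* < 1, the hyperbolic set keeps the univariate terms up to degree *p* but drops most of the high-order interaction terms, which is why it yields a much smaller basis than the standard truncation.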

The main objective in the PCE framework is to determine the expansion coefficients [14]. In this paper, a non-intrusive approach is adopted, which relies on post-processing the outputs resulting from simulation-based methods [27]. According to the least angle regression (LAR), only the polynomials with a large impact on the QoIs are retained, while the others are discarded. The LAR method is based on determining the coefficients $\mathbf{y}$ that minimize the mean square error (MSE) including a penalty term of the form $\lambda \|\mathbf{y}\|_1$, as discussed by Efron et al. [28]:

$$\hat{\mathbf{y}} = \operatorname*{arg\,min}_{\mathbf{y} \in \mathbb{R}^P} \mathbb{E}\left[ \left( \mathbf{y}^T \boldsymbol{\Psi}(\mathbf{X}) - \mathcal{M}(\mathbf{X}) \right)^2 \right] + \lambda \|\mathbf{y}\|_{1}, \tag{3}$$

where $\|\hat{\mathbf{y}}\|_1 = \sum_{\boldsymbol{\alpha} \in \mathcal{A}} |y_{\boldsymbol{\alpha}}|$ is the regularization term that forces the minimization to favor sparse solutions [26].

The LAR algorithm proceeds as follows. First, initialize the coefficients, $y_{\boldsymbol{\alpha}} = 0$, place all $\Psi_{\boldsymbol{\alpha}}$ in the candidate set, set the active set to the empty set, and set the residual equal to the vector of observations. Second, find the regressor $\Psi_{\boldsymbol{\alpha}_j}$ that is most correlated with the current residual. Third, move the active coefficients from zero towards their least-squares values until the active regressors are equally correlated with the residual as some other regressor in the candidate set. Next, compute the leave-one-out (LOO) error, $Err^{j}_{\rm LOO}$, for the current iteration, update all the active coefficients, and move $\Psi_{\boldsymbol{\alpha}_j}$ from the candidate set to the active set. Finally, repeat these steps until the size of the active set reaches $\min(N - 1, P)$. The LOO cross-validation technique is intended to prevent over-fitting [29] and is based on the following relationship:

$$Err_{\rm LOO} = \frac{\sum_{i=1}^{N} \left( \frac{\mathcal{M}\left(\mathbf{x}^{(i)}\right) - \mathcal{M}^{\rm PCE}\left(\mathbf{x}^{(i)}\right)}{1 - h_i}\right)^{2}}{\sum_{i=1}^{N} \left(\mathcal{M}\left(\mathbf{x}^{(i)}\right) - \frac{1}{N}\sum_{k=1}^{N} \mathcal{M}\left(\mathbf{x}^{(k)}\right)\right)^{2}}, \qquad h_i = \left[\operatorname{diag}\left(\mathbf{A}\left(\mathbf{A}^{T}\mathbf{A}\right)^{-1}\mathbf{A}^{T}\right)\right]_i, \tag{4}$$

where $\mathbf{A} \in \mathbb{R}^{N \times P}$ is the information matrix, which contains the evaluations of all $P$ basis polynomials at all $N$ points of the design of experiments (DOE).
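The practical appeal of Equation (4) is that the LOO error follows from a single least-squares fit, without refitting the model *N* times. A minimal NumPy sketch is given below, assuming an ordinary least-squares fit on a fixed basis; the function name `loo_error`, the one-dimensional Legendre basis, and the synthetic response are illustrative choices and not the setup used in this study.

```python
import numpy as np


def loo_error(A, y):
    """Relative leave-one-out error of the least-squares fit y ~ A @ coeffs."""
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coeffs
    # Diagonal of the projection matrix A (A^T A)^{-1} A^T, obtained from a
    # thin QR factorization (A is assumed to have full column rank).
    Q, _ = np.linalg.qr(A)
    h = np.sum(Q ** 2, axis=1)
    loo_residuals = residuals / (1.0 - h)
    return np.sum(loo_residuals ** 2) / np.sum((y - y.mean()) ** 2)


# Hypothetical 1D example: 30 samples, Legendre polynomials up to degree 3.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 30)
y = np.sin(2.0 * x) + 0.05 * rng.standard_normal(30)
A = np.polynomial.legendre.legvander(x, 3)   # information matrix, shape (30, 4)
err = loo_error(A, y)
print(err)
```

For an ordinary least-squares fit, this shortcut reproduces the result of explicitly removing each sample and refitting.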

#### *2.2. Random Fields*

In structural engineering, random fields are typically used to characterize the spatial variability of material properties or to model heterogeneity at scales ranging from micro to macro level [11]. Different random fields are formulated based on the nature of the uncertainty within the studied stochastic environment [30]. A detailed formulation of random fields and their validation is beyond the scope of this paper, and only the fundamental equations are provided (details can be found in Reference [31]). A random field is formulated in the general form:

$$H(\mathbf{x}) = \mu(\mathbf{x}) + \eta(\mathbf{x}),\tag{5}$$

where *µ*, *η*, and **x** are mean function, random function (with zero mean and an autocovariance of *Caa*), and position vector, respectively. The auto-covariance function is defined as:

$$C_{aa}(\xi, \sigma_0) = \sigma_0^2 \, \rho_{aa}(\xi), \tag{6}$$

where $\xi = |\mathbf{x} - \mathbf{x}'|$ is the distance between any two arbitrary points, $\sigma_0$ is the standard deviation, and $\rho_{aa}(\xi)$ refers to the auto-correlation function, which is defined as:

$$\rho_{aa}(\xi) = \exp\left(-\left(\frac{\xi}{l_{corr}}\right)^2\right),\tag{7}$$

where *lcorr* is the correlation length.

In this paper, the random field generation is based on covariance matrix decomposition. In addition, a simple midpoint discretization technique [32] is used to transfer the continuous nature of the random field onto the finite element mesh. A sample 2D random field is shown in Figure 1, including its finite element discretization with three types of meshes: coarse, medium, and non-uniform medium-fine. The non-uniform discretization is used in cases where the local response is sensitive to the property distribution in that area (e.g., local fracture). There is a direct relation between the correlation length, *l<sub>corr</sub>*, and the size of the optimal mesh. The smaller the *l<sub>corr</sub>*, the finer the mesh should be to properly capture the random field transitions. The number of realizations/samples is typically optimized using the LHS technique [33].

**Figure 1.** Sample random fields generation and its finite element discretization.
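The generation procedure based on Equations (5)-(7) and midpoint discretization can be sketched as follows. This is a minimal NumPy illustration under several assumptions: the field is Gaussian with a constant mean, the covariance matrix is decomposed by an eigenvalue decomposition (one of several possible decompositions), plain random sampling is used instead of LHS, and the mesh, standard deviation, and correlation length are hypothetical values. The function name `sample_random_field` is introduced only for this example.

```python
import numpy as np


def sample_random_field(midpoints, mean, sigma0, l_corr, n_samples, seed=0):
    """Sample a Gaussian random field at element midpoints (one row per sample)."""
    rng = np.random.default_rng(seed)
    # Pairwise distances xi = |x - x'| between element midpoints.
    diff = midpoints[:, None, :] - midpoints[None, :, :]
    xi = np.linalg.norm(diff, axis=-1)
    # Auto-covariance, Eq. (6), with the auto-correlation function of Eq. (7).
    C = sigma0 ** 2 * np.exp(-(xi / l_corr) ** 2)
    # Covariance decomposition C = V diag(w) V^T; clip round-off negatives.
    w, V = np.linalg.eigh(C)
    w = np.clip(w, 0.0, None)
    z = rng.standard_normal((n_samples, len(midpoints)))
    return mean + (z * np.sqrt(w)) @ V.T   # Eq. (5): H = mu + eta


# Hypothetical 6 x 4 structured mesh with unit element size.
xs, ys = np.meshgrid(np.arange(6) + 0.5, np.arange(4) + 0.5)
midpoints = np.column_stack([xs.ravel(), ys.ravel()])
# Modulus of elasticity in GPa: mean 25, assumed standard deviation 2.5.
E = sample_random_field(midpoints, mean=25.0, sigma0=2.5, l_corr=2.0, n_samples=5000)
print(E.shape)
```

Each row of `E` is one realization that assigns a modulus of elasticity to every element, so neighboring elements receive similar values while distant elements are nearly independent.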

#### *2.3. Lanczos Algorithm for Eigenvalue Problems*

Since the hybrid RF-PCE is applied to the frequency response of the coupled system, a brief review is provided for readers interested in direct implementation. Frequency analysis begins with the linear free-vibration equation of motion, neglecting the damping term: $\mathbf{M}\{\ddot{\mathbf{u}}\} + \mathbf{K}\{\mathbf{u}\} = \mathbf{0}$, where $\mathbf{M}$ and $\mathbf{K}$ are the global mass and stiffness matrices, and $\{\mathbf{u}\}$ is the displacement vector. Let us assume a harmonic motion for the displacement, $\{\mathbf{u}\} = \{\phi\}_i \sin(\omega_i t + \theta_i)$, where $\{\phi\}_i$ and $\omega_i^2 = \lambda_i$ are the eigenvectors and eigenvalues, respectively. Substituting $\{\mathbf{u}\}$ into the equation of motion, a non-trivial solution exists only if $\det(\mathbf{K} - \lambda_i \mathbf{M}) = 0$. This is the so-called eigenvalue problem, which may be solved for up to $n$ eigenvalues, where $n$ is the number of degrees of freedom (DOF). The number of unknowns is one more than the number of equations; therefore, an additional equation is needed to solve the problem. Two methods are available to provide such an additional equation, namely: (1) mode shape normalization to the mass matrix, $\{\phi\}_i^T \mathbf{M} \{\phi\}_i = 1$; and (2) mode shape normalization to unity, where the largest component of $\{\phi\}_i$ is set to a value of one.
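For a small system, the eigenvalue problem and the mass normalization can be illustrated directly with a dense solver. The sketch below is an assumption-laden example, not the solver used in this study: it reduces the generalized problem to a standard symmetric one through a Cholesky factorization of the mass matrix, and the 2-DOF matrices are hypothetical values chosen so that the result can be checked by hand (the characteristic equation gives eigenvalues of 2 and 5).

```python
import numpy as np


def modal_analysis(K, M):
    """Solve K phi = lambda M phi; return frequencies and mass-normalized modes."""
    # With M = L L^T, the problem becomes S v = lambda v, S = L^{-1} K L^{-T}.
    L = np.linalg.cholesky(M)
    L_inv = np.linalg.inv(L)
    S = L_inv @ K @ L_inv.T
    lam, V = np.linalg.eigh(S)       # eigenvalues in ascending order
    Phi = L_inv.T @ V                # columns satisfy Phi^T M Phi = I
    return np.sqrt(lam), Phi


# Hypothetical 2-DOF system.
M = np.array([[2.0, 0.0], [0.0, 1.0]])
K = np.array([[6.0, -2.0], [-2.0, 4.0]])
omega, Phi = modal_analysis(K, M)
print(omega ** 2)
```

The transformed matrix `S` is symmetric, which is the form assumed by Lanczos-type solvers for large systems.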

Many techniques can be used to solve this equation, some of which are listed in Reference [34]. The direct block Lanczos algorithm is a computationally efficient and easy-to-implement technique for solving eigenvalue problems. Originally proposed by Lanczos [35] and based on an extension of the power method, this technique was numerically unstable and was later modified by Newman and Ojalvo [36], who introduced a method for purifying the vectors in order to stabilize the solution. A detailed theoretical formulation of Lanczos' algorithm can be found in References [37,38].

Given a square matrix $\mathbf{S}$, the eigenvectors $\{\phi\}_i$ and eigenvalues $\lambda_i$ of interest satisfy $\mathbf{S}\{\phi\}_i = \lambda_i \{\phi\}_i$. Lanczos' algorithm provides a tri-diagonal matrix, $\mathbf{T}$, at the end of each step, whose extreme eigenvalues approximate the extreme eigenvalues of the matrix $\mathbf{S}$. To determine these eigenvalues, let us suppose that $\mathbf{q}_1$ is a random vector with $\|\mathbf{q}_1\| = 1$. Next, compute the $\delta$ and $\beta$ values from Algorithm 1. Then, assemble the matrix $\mathbf{T}$ and identify its eigenvalues. The tri-diagonal form facilitates the eigen-decomposition of the matrix when using the power method. Note that $\mathbf{q}_i$ should be orthogonal to all other $\mathbf{q}$ vectors.

$$\mathbf{T} = \begin{bmatrix} \delta\_1 & \beta\_2 & & & & \\ \beta\_2 & \delta\_2 & \beta\_3 & & & \\ & \beta\_3 & \ddots & \ddots & & \\ & & \ddots & \ddots & \ddots & \\ & & & \ddots & \delta\_{m-1} & \beta\_m \\ & & & & \beta\_m & \delta\_m \end{bmatrix} \tag{8}$$

**Algorithm 1** Computation of *δ* and *β*.
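A minimal Python sketch of the standard (single-vector) Lanczos recurrence for computing *δ* and *β* is given below. It is an illustration rather than the exact block algorithm used in this study: the matrix is assumed to be symmetric, full re-orthogonalization is used as a simple stand-in for the vector purification mentioned above, and the function name `lanczos` and the test matrix are hypothetical.

```python
import numpy as np


def lanczos(S, m, seed=0):
    """Run up to m Lanczos steps on a symmetric matrix S; return (T, Q)."""
    n = S.shape[0]
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(n)
    q /= np.linalg.norm(q)            # random start vector with unit norm
    Q = np.zeros((n, m))
    delta = np.zeros(m)               # diagonal entries of T
    beta = np.zeros(m)                # beta[j] couples q_{j-1} and q_j; beta[0] unused
    steps = m
    for j in range(m):
        Q[:, j] = q
        w = S @ q
        delta[j] = q @ w
        # Remove the components along all stored Lanczos vectors. This covers
        # the three-term recurrence and keeps the q vectors mutually orthogonal.
        for _ in range(2):            # two passes for numerical safety
            w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
        if j + 1 == m:
            break
        beta[j + 1] = np.linalg.norm(w)
        if beta[j + 1] < 1e-12:       # invariant subspace reached: stop early
            steps = j + 1
            break
        q = w / beta[j + 1]
    off = beta[1:steps]
    T = np.diag(delta[:steps]) + np.diag(off, 1) + np.diag(off, -1)
    return T, Q[:, :steps]


# Hypothetical symmetric test matrix with one well-separated eigenvalue (50).
rng = np.random.default_rng(1)
B, _ = np.linalg.qr(rng.standard_normal((100, 100)))
eigs = np.concatenate([np.linspace(1.0, 10.0, 99), [50.0]])
S = B @ np.diag(eigs) @ B.T
S = 0.5 * (S + S.T)
T, Q = lanczos(S, m=30)
largest = np.linalg.eigvalsh(T)[-1]
print(largest)
```

Here 30 steps on a 100 x 100 matrix are enough for the largest eigenvalue of **T** to approximate the largest eigenvalue of **S**, which illustrates why the tri-diagonal matrix can remain much smaller than the original system.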


#### *2.4. Hybrid Method*

The objective of this paper is to combine the random fields concept with PCE to quantify the uncertainties in heterogeneous dam models. Moreover, the developed surrogate model is used for sensitivity analysis to identify the most important regions within the dam body. The initial framework for uncertainty quantification introduced by Sudret [39] and used by Hariri-Ardebili and Sudret [14] for the RV-based assessment of dam structures is expanded in this paper to incorporate the impact of spatial correlation. This framework has three main elements, as illustrated in Figure 2:

**Figure 2.** A framework for uncertainty quantification with random fields; adapted from Reference [40] and expanded.


• Step C: Perform an uncertainty analysis that combines the input uncertainties with the computational model, and quantifies the characteristics of the stochastic system. The PCE meta-model, $\mathcal{M}^{\rm PCE}$, will be used for this purpose.

The final goal of this paper (as discussed in Section 4) is to answer this important question: "Is it possible to use the PCE-assisted surrogate model for sensitivity and uncertainty quantification of heterogeneous concrete dam models?" Moreover, the findings will be verified by the classical random forest method (as discussed in Section 5).

#### **3. Case Study Dams**

In order to evaluate the capability of PCE in conjunction with random fields in dam engineering problems, two case study dams are considered in this paper. The two case studies have different numbers of random variables. Each random variable is a region/element within the dam, which reflects the spatial variability of the material properties. In both dams, only the concrete modulus of elasticity is assumed to be a random variable (due to potential aging and, therefore, reduction of elasticity), while all other material properties are kept at their mean values. One may note that concrete strength is also random (and is typically reduced due to aging); however, it is not used during modal analysis.

In the first case study, a coarse mesh is used to reduce the number of input parameters in the PCE meta-model. This symmetric model (hereafter Dam-1) includes only 72 random variables. This example is representative of dams which are constructed with different types of cement (and/or concrete strength) at different locations (typically various concreting regions). The modulus of elasticity, mass density, and Poisson's ratio of the dam body are 25 GPa, 2450 kg/m<sup>3</sup>, and 0.17, respectively.

The second example is an asymmetric dam (hereafter Dam-2) with a finer mesh (316 input parameters). This example challenges the capability of the developed meta-model for high-dimensional problems, and can be representative of a dam suffering from non-uniform aging/deterioration. The modulus of elasticity, mass density, and Poisson's ratio of the dam body are 19 GPa, 2450 kg/m<sup>3</sup>, and 0.18, respectively. The finite element models of the case study dams are illustrated in Figure 3. The slenderness coefficient of Dam-2 is 13.8, compared to 9.9 for Dam-1. Figure A1 illustrates the first ten vibration modes of both dams.

**Figure 3.** Finite element model of the case study dams with main dimensions (not to scale).

#### **4. Results: Correlation-Based vs. Variance-Based Decomposition**

This section will discuss the results of implementation of PCE surrogate model to quantify the sensitivity of the spatial location in arch dams for various vibration modes.

#### *4.1. Dam 1*

PCE meta-models are developed based on the LAR method. A fixed truncation strategy is applied along with a degree-adaptive sparse PCE option. Hence, the *q*-norm is set to 0.75, and the polynomial degree *p* is selected adaptively from the range 2 to 5. Three QoIs are evaluated: frequency of vibration, effective mass, and participation factor.

Different DOEs (with 100 to 1000 points) are used to evaluate the sensitivity and accuracy of the prediction. Figure 4 compares the response prediction via PCE and the corresponding expansion coefficients. Five initial (and deterministic) design of experiments (DOE) sizes are considered, i.e., 100, 200, 400, 800, and 1000. Clearly, increasing the initial sample size increases the accuracy of the PCE meta-model. In addition, it increases the number of expansion coefficients required for the meta-modeling. Qualitatively, the performance of the PCE for the frequency response is very similar to that for the effective mass. The results for the participation factor are not shown, as they are very similar to those of the other two responses.

**Figure 4.** Dam-1; Polynomial Chaos Expansion (PCE)-based surrogate models for varying number of initial DOEs = 100, 200, 400, 800, and 1000 (left to right).

Figure 4 only considers the 1st vibration mode. Next, the effects of higher modes are evaluated. A batch with 800 initial DOE points is considered to be the pilot Dam-1 model, and the results are expanded to 20 modes. In each case, a PCE model similar to that of mode #1 is implemented. To reduce the uncertainties due to the initial random selection of DOE points, a total of 20 replications is performed. Therefore, the developed meta-models have a probabilistic nature; see Figure 5.

According to Figure 5a,b, the mean and variance increase as a function of frequency. While the mean frequency has a quite uniform form, there are some nonlinear variations in the frequency variance (with some spikes). Based on Figure 5c, the number of non-zero (NnZ) coefficients is constant for the first 4 modes (at about 280); it then drops to about 180 at mode #8 and remains nearly constant afterwards. There is a considerable probabilistic dispersion in the NnZ value depending on the initial sampling batch. Finally, the LOO error has an increasing trend with some large spikes and small dispersion; see Figure 5d.

As opposed to the frequency response, there is no clear trend in the mean and variance of the effective mass; see Figure 5e,f. Moreover, the NnZ has a fluctuating pattern and decreases from about 270 to 220 between mode #1 and mode #20; see Figure 5g. The LOO error behaves in the opposite way to the NnZ and increases in a fluctuating fashion; see Figure 5h.

**Figure 5.** Dam-1; Probabilistic PCE-based surrogate models for different vibration frequencies and quantity of interests (QoIs) all with 800 DOE; *ω*: frequency.

Finally, the impact of the DOE size on the quality of the meta-model is studied in Figure 6 (only for the frequency response). The means (not shown here) for the different sample sizes are practically identical. The variance values also have a similar trend for DOE sizes from 200 to 1000, while the model with DOE = 100 has a different pattern; see Figure 6a. Increasing the DOE size increases the number of NnZ coefficients; see Figure 6b. Finally, the LOO error decreases with an increase in the DOE size.

**Figure 6.** Dam-1; impact of DOE size on the PCE-based surrogate models for frequency response.

Next, the sensitivity of the QoIs is computed with respect to the input RVs. Linear correlation is the simplest measure of sensitivity; however, it might be inaccurate in the presence of strongly nonlinear dependence between RVs. Spearman's rank correlation index, $\rho_S$, is a more robust alternative which accounts for the monotonicity, instead of the linearity, of the dependence between RVs. First, all the inputs and outputs are transformed into their rank-equivalents:

$$R_i = \left\{ r_i^{(j)} \in \{1, \dots, N\} : r_i^{(j)} > r_i^{(k)} \Leftrightarrow x_i^{(j)} > x_i^{(k)} \;\; \forall j, k \in \{1, \dots, N\} \right\}. \tag{9}$$

In the same way, the rank-transformed model response is $R_Y = \left\{ r_Y^{(1)}, r_Y^{(2)}, \ldots, r_Y^{(N)} \right\}$. Finally, the Spearman's rank correlation indices are:

$$\rho_{S_i} = \rho(R_i, R_Y) = \frac{\mathbb{E}\left[ \left(R_i - \mu_{R_i}\right) \left( R_Y - \mu_{R_Y} \right) \right]}{\sigma_{R_i} \sigma_{R_Y}},\tag{10}$$

where *µ* is the expectation of the corresponding quantity, and *σ* refers to its standard deviation.
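The rank transformation and the resulting index can be illustrated with a minimal NumPy sketch. It assumes continuous data without ties (ties would need average ranks, which are not handled here); the function names and the synthetic monotonic response are hypothetical.

```python
import numpy as np


def rank_transform(v):
    """Map each value to its rank in {1, ..., N} (assumes no ties)."""
    ranks = np.empty(len(v), dtype=float)
    ranks[np.argsort(v)] = np.arange(1, len(v) + 1)
    return ranks


def spearman_index(x, y):
    """Spearman's rank correlation: linear correlation of the rank-transformed data."""
    r_x, r_y = rank_transform(x), rank_transform(y)
    cov = np.mean((r_x - r_x.mean()) * (r_y - r_y.mean()))
    return cov / (r_x.std() * r_y.std())


# Hypothetical monotonic but strongly nonlinear dependence.
x = np.linspace(0.1, 2.0, 50)
y = np.exp(3.0 * x)
rho_s = spearman_index(x, y)
rho_p = np.corrcoef(x, y)[0, 1]   # linear correlation, for comparison
print(rho_s, rho_p)
```

For this monotonic relation the rank-based index equals one, whereas the linear correlation coefficient is lower because the dependence is not linear.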

On the other hand, a direct outcome of a PCE-based surrogate model is the Sobol indices, which can be used as a metric for sensitivity analysis. Decomposition of the variance provides sensitivity measures of the form $S_{i_1,\ldots,i_s}$, which represent the relative contribution of each group of RVs $X_{i_1}, \ldots, X_{i_s}$ to the total variance. The first order Sobol index is derived with respect to a single RV $X_i$; its interactions with other RVs, which are described by higher-order Sobol indices, are neglected. In functional form, the sensitivity index is:

$$S_{i_1,\dots,i_s} = \frac{D_{i_1,\dots,i_s}}{D},\tag{11}$$

where the partial variances are:

$$D_{i_1,\ldots,i_s} = \int \ldots \int f_{i_1,\ldots,i_s}^{2} (x_{i_1}, \ldots, x_{i_s}) \, dx_{i_1} \ldots dx_{i_s}, \quad 1 \le i_1 < \ldots < i_s \le M; \ s = 1,\ldots,M,\tag{12}$$

where $f_{i_1,\ldots,i_s}$ are the terms of the Sobol decomposition of $f(\mathbf{X})$, and $D$ is the total variance.
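When the PCE basis is orthonormal, the variances in Equation (11) reduce to sums of squared expansion coefficients, so the first order indices can be read off the coefficients without any additional model runs. A minimal Python sketch is given below; the function name `first_order_sobol` and the toy coefficients are hypothetical, and the orthonormality of the basis is an assumption of the example.

```python
def first_order_sobol(coeffs):
    """First order Sobol indices from PCE coefficients.

    coeffs maps each multi-index (tuple of length M) to its coefficient;
    the polynomial basis is assumed to be orthonormal.
    """
    M = len(next(iter(coeffs)))
    # Total variance: squared coefficients of all non-constant terms.
    total_var = sum(y ** 2 for alpha, y in coeffs.items() if any(alpha))
    indices = []
    for i in range(M):
        # Partial variance: terms that depend on X_i only.
        partial = sum(
            y ** 2
            for alpha, y in coeffs.items()
            if alpha[i] > 0 and all(alpha[k] == 0 for k in range(M) if k != i)
        )
        indices.append(partial / total_var)
    return indices


# Hypothetical expansion with two inputs and one interaction term.
coeffs = {(0, 0): 10.0, (1, 0): 2.0, (0, 1): 1.0, (2, 0): 0.5, (1, 1): 0.5}
S1 = first_order_sobol(coeffs)
print(S1)
```

In this toy case the two first order indices do not add up to one; the remainder is the share of the variance carried by the interaction term (1, 1).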

Figure 7 illustrates both the "Correlation Rank" and the first order Sobol index for 10 frequencies and 72 RVs (which correspond to spatial locations). As seen in this figure, the variation of colors, which corresponds to the sensitivity index, is much higher in the correlation rank method than in the first order Sobol index. Moreover, the correlation rank method yields higher relative values. The highly sensitive RVs (locations) estimated by the two methods are close. It seems that the Sobol index filters out more RVs than the correlation rank. Since each vibration mode is analyzed independently of the others, the relative sensitivity index of one mode should not be compared to that of other modes (only the pattern is important).

**Figure 7.** Dam-1; comparison of sensitivity indices for two methods and 10 modes.

Figure 8 compares the spatial distribution of the sensitivity indices for the two methods and only the first six modes of vibration. The main observations are as follows:


non-uniform distribution of the sensitivity index within the dam body. The results from the Sobol index are quite symmetric.

• According to these figures, the regions of the dam in the vicinity of the abutments are the most sensitive locations for modes #1, #3, and #6. On the other hand, the middle regions located in the upper half are the most sensitive for modes #2, #4, and #5. The lower half of the dam has practically no influence on its overall vibration characteristics.

**Figure 8.** Dam-1; Spatial distribution of sensitive location for different vibration modes.

#### *4.2. Dam 2*

A method similar to that of Dam-1 is used for Dam-2. Figure 9 compares the response prediction via PCE (the expansion coefficients are omitted). Four initial (and deterministic) DOE sizes are considered, i.e., 500, 1000, 2000, and 3000. Again, increasing the initial sample size increases the accuracy of the PCE meta-model, as well as the number of expansion coefficients. However, compared to Dam-1, Dam-2 requires a much larger DOE to provide a good prediction. In addition, the performance of the PCE for the frequency response seems to be better than for the effective mass (this was not the case for Dam-1).

Figure 9 only considers the 1st vibration mode. Therefore, the effects of higher modes are evaluated by analyzing the first 20 modes. A batch with 2000 initial DOE points is considered to be the pilot Dam-2 model. In each case, a PCE model similar to that of mode #1 is generated, again with 20 replications (to reduce the potential uncertainty due to the random selection of DOE points); see Figure 10.

**Figure 9.** Dam-2; PCE-based surrogate models for varying number of initial DOEs.

**Figure 10.** Dam-2; Probabilistic PCE-based surrogate models for different vibration frequencies and QoIs all with 2000 DOE.

According to Figure 10a,b, the mean and variance increase as a function of frequency (this response is similar to Dam-1). While the mean frequency has a quite uniform form, there are some nonlinear variations in the frequency variance (with some spikes). Based on Figure 10c, the number of non-zero coefficients increases from about 340 initially to about 420 at mode #8. It then fluctuates around 360 (this observation is different from Dam-1). Finally, the LOO error also has a non-uniform and non-constant pattern (as opposed to Dam-1); see Figure 10d.

Similar to Dam-1, there is no clear trend in the mean and variance of the effective mass for Dam-2; see Figure 10e,f. Both the NnZ and the LOO error also have a non-uniform pattern.

Finally, the impact of the DOE size on the quality of the meta-model is studied in Figure 11 (only for the frequency response). The variance values have a similar trend for DOE sizes from 1000 to 3000, while the model with DOE = 500 is slightly different; see Figure 11a. Increasing the DOE size increases the number of NnZ coefficients; see Figure 11b (as opposed to Dam-1, the variation of NnZ is nearly constant). Finally, the LOO error decreases with an increase in the DOE size (similar to Dam-1).

Figure 12 compares both the Correlation Rank and the first order Sobol index for 10 frequencies and 316 RVs (which correspond to spatial locations). Again, the variation of the color contours is much larger for the Correlation Rank method than for the first order Sobol index (which highlights only a few locations). Qualitatively, the sensitive locations predicted by the two methods are different.

**Figure 11.** Dam-2; impact of DOE size on the PCE-based surrogate models for frequency response.

**Figure 12.** Dam-2; comparison of sensitivity indices for two methods and 10 modes.

Figure 13 compares the spatial distribution of the sensitivity indices for the two methods using the first six modes of vibration. As opposed to Dam-1, in which there was only one layer of elements within the thickness, Dam-2 has two layers of elements. This provides the opportunity to investigate the curvature of the mode shapes and, thus, the sensitivity of the upstream and downstream side elements for various modes. The main observations are as follows:



**Figure 13.** Dam-2; Spatial distribution of sensitive location for different vibration modes.

#### *4.3. Spectrum of Sensitivity Indices*

The summation of Sobol indices for all random variables is equal to one. Figure 14 summarizes the most important observation of this paper, namely the spectrum of cumulative first order Sobol indices for the first 10 modes of each dam. Any sharp jump or very steep slope in these curves indicates the importance of that particular random variable (i.e., location). These curves help to identify the sensitive locations in a dam for further inspection and potential instrumentation.

**Figure 14.** Spectrum of cumulative first order Sobol indices along number of random variables, *NRV*, for various modes.
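The construction of such a spectrum can be sketched in a few lines of Python. The index values and the jump threshold below are hypothetical and serve only to show how a sharp jump in the cumulative curve points to an important random variable; the function name `sobol_spectrum` is introduced for this example.

```python
from itertools import accumulate


def sobol_spectrum(first_order, jump_threshold):
    """Return the cumulative curve and the 1-based RV numbers causing sharp jumps."""
    cumulative = list(accumulate(first_order))
    jumps = [k + 1 for k, s in enumerate(first_order) if s > jump_threshold]
    return cumulative, jumps


# Hypothetical first order Sobol indices of six RVs (locations) for one mode.
S1_mode = [0.01, 0.02, 0.40, 0.03, 0.30, 0.02]
cumulative, jumps = sobol_spectrum(S1_mode, jump_threshold=0.10)
print(cumulative, jumps)
```

In this toy example the curve jumps at the third and fifth RVs, which would be the locations flagged for further inspection.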

#### **5. Results: Random Forest-Based Ensemble Regression**

Random forest [41] is an ensemble method that builds a large collection of decorrelated trees and then averages them [42]. It is one of the basic, standard methods in machine learning and is used in this section to further validate the results from PCE. Aside from its applications in classification and regression problems, random forest also provides a variable importance feature. This feature is founded on the Out-of-Bag (OOB) data concept, which refers to the samples that are not selected during bootstrapping in the random forest procedure. The error is then computed by evaluating the random forest model on the OOB data. Mathematically, the importance score (IS) is described as:

$$\text{IS} = \frac{1}{N_t} \sum_{j=1}^{N_t} \left( E_j - E_j^* \right), \tag{13}$$

where *N<sub>t</sub>* is the number of trees in the model, and *E<sub>j</sub>* and *E<sub>j</sub>*<sup>∗</sup> are the errors of each tree applied to the OOB and perturbed OOB data, respectively. This method has been implemented on the data from both dams using the "randomForest" function [43] available in R with "ntree = 500".
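The OOB bookkeeping behind the importance score can be illustrated with a short sketch. It does not fit a random forest: a fixed stand-in predictor plays the role of every tree, and all function names are hypothetical. The score is taken here as the perturbed error minus the unperturbed error, so that a larger value means a more important variable; that sign convention is an assumption of this sketch.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw a bootstrap sample; rows never drawn form the out-of-bag set."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    chosen = set(in_bag)
    oob = [i for i in range(n_samples) if i not in chosen]
    return in_bag, oob


def mse(predict, X, y):
    """Mean squared error of a predictor on the rows of X."""
    return sum((predict(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def permute_column(X, col, rng):
    """Copy X with the values of one column shuffled across rows."""
    values = [row[col] for row in X]
    rng.shuffle(values)
    return [row[:col] + [v] + row[col + 1:] for row, v in zip(X, values)]


def importance_score(errors_oob, errors_permuted):
    """Mean per-tree increase in OOB error after perturbing one variable."""
    n_trees = len(errors_oob)
    return sum(ep - e for e, ep in zip(errors_oob, errors_permuted)) / n_trees


def stand_in_tree(row):
    """Hypothetical stand-in for a fitted tree (not a trained model)."""
    return 3.0 * row[0]


rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(60)]
y = [3.0 * row[0] for row in X]  # the response depends on column 0 only

scores = []
for col in range(2):
    e_oob, e_perm = [], []
    for _ in range(20):  # 20 bootstrap replicates standing in for 20 trees
        _, oob = bootstrap_split(len(X), rng)
        X_oob = [X[i] for i in oob]
        y_oob = [y[i] for i in oob]
        e_oob.append(mse(stand_in_tree, X_oob, y_oob))
        e_perm.append(mse(stand_in_tree, permute_column(X_oob, col, rng), y_oob))
    scores.append(importance_score(e_oob, e_perm))
print(scores)
```

Perturbing the column that drives the response raises the OOB error, while perturbing the irrelevant column leaves it unchanged, which is the behavior the importance score is meant to capture.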

Figure 15 presents the variable importance estimated by the random forest method for the two dams. In general, the random forest pattern is closer to that of PCE than to that of Correlation Rank. In addition, random forest appears to assign higher weight to some RVs (regions) and thus filters out less important variables. In random forest, the local variable importance is shown graphically using trees. Trees can be shown based on either absolute or relative values. Figure A2 presents trees for the first six modes of Dam-1 and Dam-2 based on the normalized modulus of elasticity value (which has a heterogeneous distribution). At any node of a regression tree, the value of the random variable (in this case, 1 to 72 for Dam-1 and 1 to 316 for Dam-2, in the form of RV∗∗∗) is checked, and depending on the (binary) answer, the tree grows to the left or right sub-branch. Once a tree reaches any of its leaves, the estimated value (in this case, the vibration frequency) is obtained. As opposed to linear or polynomial regression, which are global models (the meta-model is supposed to hold in the entire data space), trees partition the data space into parts small enough that a simple, different model may be applied to each part.

Figure 16 illustrates the spatial distribution of variable importance for both dams based on the random forest method. In general, random forest provides a spatial distribution very similar to that of the first order Sobol index. The only difference is that the first order Sobol index provided a completely symmetric sensitivity prediction for Dam-1, while random forest exhibits slightly asymmetric behavior (in any case, the level of asymmetry is much less than that of the Correlation Rank estimation). In the case of Dam-2, random forest appears to filter out the insensitive RVs (regions) even more and strictly focuses on a few highly important RVs.

**Figure 15.** Estimation of variable importance using random forest for two dams and 10 modes.

**Figure 16.** Random forest-based spatial distribution of sensitive location for different vibration modes in Dam-1 and Dam-2.

Finally, Figure 17 compares the relative importance of RVs (72 RVs in Dam-1 and 316 RVs in Dam-2) from random forest (horizontal axis) with Correlation Rank (left vertical axis) and the PCE-based first order Sobol index (right vertical axis). As seen, there is a high correlation between random forest and PCE, with a coefficient of determination (*R*<sup>2</sup>) of 0.99 for Dam-1 and 0.95 for Dam-2. On the other hand, random forest is less correlated with the Correlation Rank method, with an average *R*<sup>2</sup> of about 0.85 for Dam-1 and 0.70 for Dam-2. The reported goodness-of-fit metrics are based on a fitted power-form equation, *ax<sup>b</sup>* + *c*, which shows a completely nonlinear relation between Correlation Rank and random forest, and a semi-linear one between PCE and random forest.

**Figure 17.** Correlation among three sensitivity metrics for various vibration modes.
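The paper does not state how the power-form equation was fitted. One simple possibility, sketched below under that caveat, is to scan candidate exponents *b* and, for each one, solve for *a* and *c* by ordinary least squares, since the model is linear in *a* and *c* once *b* is fixed. The function name, the grid, and the synthetic data are all assumptions.

```python
def fit_power_form(x, y, b_grid):
    """Fit y ~ a * x**b + c by scanning b and solving a, c in closed form.

    Assumes positive x values and at least one usable exponent in b_grid.
    """
    n = len(x)
    y_mean = sum(y) / n
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    best = None
    for b in b_grid:
        u = [xi ** b for xi in x]  # transformed predictor
        u_mean = sum(u) / n
        s_uu = sum((ui - u_mean) ** 2 for ui in u)
        if s_uu == 0.0:
            continue  # degenerate exponent, skip
        a = sum((ui - u_mean) * (yi - y_mean) for ui, yi in zip(u, y)) / s_uu
        c = y_mean - a * u_mean
        ss_res = sum((yi - (a * ui + c)) ** 2 for ui, yi in zip(u, y))
        if best is None or ss_res < best[0]:
            best = (ss_res, a, b, c)
    ss_res, a, b, c = best
    r2 = 1.0 - ss_res / ss_tot  # coefficient of determination
    return a, b, c, r2


# Synthetic data generated from a known power law (not data from the paper).
x = [0.05 * k for k in range(1, 21)]
y = [0.8 * xi ** 0.5 + 0.1 for xi in x]
b_grid = [0.1 * k for k in range(1, 31)]
a, b, c, r2 = fit_power_form(x, y, b_grid)
print(a, b, c, r2)
```

A nonlinear least-squares routine would refine *b* continuously; the grid scan is used here only to keep the example dependency-free.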

#### **6. Connection to the System Identification and Dynamic Analysis**

Arch dams are three-dimensional shell-type structures with a large degree of indeterminacy. The anatomy of vibration in concrete arch dams is complex and requires special techniques to solve the eigenvalue problems. Typically, the material properties of concrete are assumed to be isotropic and homogeneous, which facilitates the coupled modal analysis. However, it is well accepted that the material properties within dams (and, in general, within any large-scale infrastructure) are not quite homogeneous.

The structural-level heterogeneity might be due to differences in concreting at large scale, or to deterioration/aging over time. The latter is typically observed in dams due to alkali aggregate reaction (AAR). Concrete heterogeneity alters the vibration nature of the structure (compared to the homogeneous one) [44]. The U.S. Bureau of Reclamation [9] conducted a series of seismic tomography tests on Seminoe Dam, located in Wyoming, which is suffering from alkali-silica reaction (ASR). While the lower parts of the dam have a modulus of elasticity of about 20 GPa, it reduces to only 9 GPa in the vicinity of the crest, with a highly heterogeneous pattern. Custódio et al. [45] studied the performance of the 80 m high Miranda buttress dam. Evidence of AAR has been found in the form of progressive upward vertical displacements, as well as vertical sliding on the central contraction joints. The cylindrical test results show that there is great variability in the condition of the concrete sampled throughout the structure: compressive strength ranges from 20 to 48 MPa, tensile splitting test results from 1.7 to 4.1 MPa, and stiffness damage test results from 23 to 37 GPa.

The outcome of this study can improve the results of dynamic analysis in different aspects. During model calibration (e.g., based on a forced vibration test), the results of the sensitivity analysis help to minimize the computational time for system identification by focusing mainly on the most important regions. In this approach, instead of assigning thousands of potential heterogeneous patterns to the dam body, the calibration is conducted only by searching for the right material properties in the regions that are sensitive for that particular vibration mode.

Next, the outcome is used in dynamic analysis. Two widely used finite element dynamic analysis techniques for concrete dams are: response spectrum analysis and time integration method.

• In the response spectrum analysis technique, the response of the coupled system (i.e., displacement and stress) is computed separately for each vibration mode, and the modal responses are then combined to calculate the total system-level response. Using a proper heterogeneous model with exact effective mass and participation factor improves the accuracy of the total calculation. In addition, it is important to identify the most effective vibration modes, i.e., those which contribute most to the dynamic response of the system. The number of effective modes (which might be different in heterogeneous models compared to the homogeneous ones) is typically selected to reach at least 90% total accuracy.

• In the time-integration time history analysis, the heterogeneous dam assumption alters the stiffness matrix and the damping values at different modes.
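The mode-selection and modal-combination steps of the response spectrum technique can be sketched as follows. Two assumptions are made that the text does not confirm: the 90% criterion is interpreted as a cumulative effective mass ratio, and the square-root-of-sum-of-squares (SRSS) rule is used for combination. The numerical values are made up.

```python
import math


def modes_needed(effective_mass_ratios, target=0.90):
    """Smallest number of leading modes whose cumulative ratio meets the target.

    Returns None if the target is never reached with the supplied modes.
    """
    total = 0.0
    for k, ratio in enumerate(effective_mass_ratios, start=1):
        total += ratio
        if total >= target:
            return k
    return None


def srss(modal_responses):
    """Combine peak modal responses with the SRSS rule (assumed here)."""
    return math.sqrt(sum(r * r for r in modal_responses))


# Hypothetical effective mass ratios of the first six modes (made-up values).
ratios = [0.45, 0.25, 0.12, 0.06, 0.04, 0.03]
n_modes = modes_needed(ratios)
print(n_modes)
print(srss([3.0, 4.0]))
```

Because heterogeneity changes the effective masses, the same function applied to a heterogeneous and a homogeneous model may return different mode counts.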

#### **7. Summary and Conclusions**

Quantification of structural vibration characteristics is an essential task prior to performing any dynamic health monitoring and system identification. Most finite element models are validated based on the vibration characteristics of the dam; thus, ignoring concrete heterogeneity may lead to poor/wrong model validation.

This paper proposed a hybrid random field–polynomial chaos expansion surrogate model for uncertainty quantification and sensitivity assessment of dam structures. First, an efficient random field model was developed and the concrete heterogeneity was simulated. Next, two dam models (symmetric and asymmetric) were selected and a large number of eigenvalue analyses were performed on the models. From the analyses, the vibration characteristics (i.e., frequencies, effective masses, and participation factors) were extracted. This large database was used to develop a PCE-based surrogate model. A natural outcome of such a meta-model is the quantification of the Sobol indices of the various random variables (in this case, different locations). These indices are an effective metric to identify the most sensitive dam locations for different vibration modes.

Another important conclusion is that the general trend of a symmetric dam is different from that of an asymmetric one because, in the former, the modes are well separated, while, in the latter, the modes interact. Comparing the PCE-based sensitivity analysis with the Correlation Rank method showed that the latter does not provide a reliable estimate of the important random variables. Furthermore, the results of the proposed hybrid model were validated using the classical random forest regression method, and good consistency was observed.

**Author Contributions:** Conceptualization, M.A.H.-A.; methodology, M.A.H.-A.; software, M.A.H.-A., G.M., A.A. (Azam Abdollahi) and A.A. (Ali Amini); validation, M.A.H.-A. and G.M.; formal analysis, G.M., A.A. (Azam Abdollahi) and A.A. (Ali Amini); investigation, M.A.H.-A., G.M., A.A. (Azam Abdollahi) and A.A. (Ali Amini); writing—original draft preparation, M.A.H.-A., G.M., A.A. (Azam Abdollahi) and A.A. (Ali Amini); writing—review and editing, M.A.H.-A.; visualization, M.A.H.-A., G.M., A.A. (Azam Abdollahi) and A.A. (Ali Amini); supervision, M.A.H.-A. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author.

**Conflicts of Interest:** The authors declare no conflict of interest.

**Appendix A. Dam Mode Shapes**

**Appendix B. Detailed Regression Trees**

**Figure A2.** Regression trees to identify the important random variables (RVs) (i.e., locations) at different vibration modes in Dam-1 and Dam-2.

#### **References**


## *Article* **Accounting for Uncertainties in the Safety Assessment of Concrete Gravity Dams: A Probabilistic Approach with Sample Optimization**

**Rocio L. Segura <sup>1</sup> , Benjamin Miquel <sup>2</sup> , Patrick Paultre 1,\* and Jamie E. Padgett <sup>3</sup>**


**Abstract:** Important advances have been made in the methodologies for assessing the safety of dams, resulting in the review and modification of design guidelines. Many existing dams fail to meet these revised criteria, and structural rehabilitation to achieve the updated standards may be costly and difficult. To this end, probabilistic methods have emerged as a promising alternative and constitute the basis of more adequate procedures of design and assessment. However, such methods, in addition to being computationally expensive, can produce very different solutions, depending on the input parameters, which can greatly influence the final results. Addressing the existing challenges of these procedures to analyze the stability of concrete dams, this study proposes a probabilistic-based methodology for assessing the safety of dams under usual, unusual, and extreme loading conditions. The proposed procedure allows the analysis to be updated while avoiding unnecessary simulation runs by classifying the load cases according to the annual probability of exceedance and by using an efficient progressive sampling strategy. In addition, a variance-based global sensitivity analysis is performed to identify the parameters most affecting the dam stability, and the parameter ranges that meet the safety guidelines are formulated. It is observed that the proposed methodology is more robust, more computationally efficient, and more easily interpretable than conventional methods.

**Keywords:** gravity dams; safety assessment; probabilistic analysis; parameter uncertainty; sample optimization; variance-based sensitivity analysis

#### **1. Introduction**

Dams are a vital part of the nation's infrastructure, providing economic, environmental, and social benefits. The benefits of dams, however, are countered by the risks they can present [1]. The structural stability of major dams needs to be re-evaluated every 5–10 years according to hazard classification systems (HCSs), most often within the legal framework of a governmental regulatory agency [2]. Requirements for the stability of concrete dams in the current regulations are based on simplifications, which, in many cases, are very conservative. Concrete dams in Canada, as in most of the world, are designed and assessed based on a deterministic framework using safety factors (SFs). Although the failure rate of gravity dams is low, the deterministic approach suffers from several problems, including the equal treatment of loads and the identical consideration of strength and capacity uncertainties, which are combined into a single safety factor. As a consequence, unnecessary rehabilitation works may be carried out on dams that are safe but do not meet the safety requirements. When safety is re-evaluated, it is important that this evaluation is based on modern safety concepts, such as a probabilistic analysis, to support decision-making [3].

**Citation:** Segura, R.L.; Miquel, B.; Paultre, P.; Padgett, J.E. Accounting for Uncertainties in the Safety Assessment of Concrete Gravity Dams: A Probabilistic Approach with Sample Optimization. *Water* **2021**, *13*, 855. https://doi.org/10.3390/w13060855

Academic Editors: M. Amin Hariri-Ardebili, Fernando Salazar, Farhad Pourkamali-Anaraki, Guido Mazzà and Juan Mata

Received: 25 January 2021 Accepted: 15 March 2021 Published: 20 March 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In contrast to the deterministic approach, the probabilistic approach requires the treatment of each parameter as a continuous function that associates a probability of occurrence to the distribution. This probability density function (PDF) allows variables to be treated as uncertain inputs by directly incorporating uncertainties into the model evaluation process [4]. The uncertainties are propagated through the system to obtain a quantitative estimate of the probability of exceeding a specific loading scenario or system configuration. Although probabilistic methods have been considered as a promising alternative for the safety assessment of dams, such methods often require a large number of simulations. Moreover, in the context of a probabilistic analysis, new information gained through laboratory tests, in situ tests, empirical observations, etc., could update the prior knowledge assumed for different variables in the PDFs. However, this is frequently overlooked given the costly re-evaluation of these simulations. Innovations, such as the use of machine learning techniques [5–7], have been proposed to overcome these drawbacks. However, such procedures can be subject to misinterpretation if not applied correctly due to their complexity. Thus, there is a need to develop simplified and more expeditious methods for analyzing the safety of dams within a probabilistic framework.

To further encourage the applicability of these methods, the main objective of this study is to develop a probabilistic-based methodology to assess the safety of dams that allows the analysis to be updated in light of new information, preventing the need for the re-evaluation of the system simulations. Additionally, an efficient progressive sampling strategy that sequentially generates sample points while progressively preserving the distributional properties of interest is used to optimize the sample size and avoid unnecessary simulation runs. As a result, after the final number of simulation runs is established, variations in the loading conditions are considered by modifying the parameter's cumulative distribution function defining the annual probability of exceedance. In this manner, there is no need to re-run simulations when the system's loading parameters vary. These simulations are then used to estimate the probability of exceeding a target monitored response for a given loading scenario. In the same way, variance-based global sensitivity analyses with progressive complexity are performed to identify the parameters most affecting the dam stability, and ranges of values that satisfy the SFs provided by safety guidelines are formulated. The proposed methodology is applied to a case study gravity dam located in north-eastern Canada.

#### **2. Related Works**

In recent decades, the knowledge in the evaluation of natural hazards has evolved considerably, making it necessary to reassess the stability of dams under usual, unusual, and extreme loading. As mentioned above, methods for analyzing the structural stability of a dam system rely on deterministic or probabilistic approaches. Deterministic analysis has traditionally been used to assess the stability of dams [8–11]. Nevertheless, these methods are often considered overly conservative or even unsafe in some cases because they neglect the different sources of uncertainty and because of the use of extreme load cases with very low probabilities of occurrence [12–14]. Thus, there is interest in moving towards more refined methods for considering uncertainties. For these reasons, probabilistic assessment has emerged as a useful tool in dam safety, and the results have been found to be promising by recent studies [2,3,15–19]. However, the use of probabilistic methods for dams within a normative framework is not well developed, but increasing scrutiny is being applied to this field. The most recent guidelines for the design and assessment of gravity dams are now including probabilistic notions for the assessment of these structures [4,14,20–24].

For a reliable probabilistic analysis, the emphasis must be placed on the quality of the input parameters and in particular the uncertainties. However, probabilistic assessment, no matter how sophisticated, can still lead to very different solutions for a given problem because of the complex choices of random variables (RVs), characteristic values, PDFs, and bounds, which can largely influence final results [25–27]. Consequently, the analysis is generally not updated in light of new information due to the time-consuming re-evaluation and the lack of flexibility in the methods regarding including modified PDFs and bounds. Indeed, although the information contained in the probabilistic results is far-reaching, it is still currently difficult to carry out an in-depth study to assess the safety of dams according to all scenarios. Accordingly, together with the probabilistic approach, the deterministic method can be used to complement the safety assessment [28]. To this end, a preliminary RV selection phase or sensitivity analysis is sometimes proposed prior to probabilistic analyses. Tornado diagrams (TDs) [29] are an example of the deterministic (or semideterministic if the input variables are PDFs) sensitivity methods that have been widely used for the efficient selection of leading variables [30–33]. Similarly, more refined methods, such as analysis of variance (ANOVA) and Sobol's method [34], have been used for assessing the significance of each modeling parameter on the structural responses of dams [17,35,36].

#### **3. Methodology**

Probabilistic analysis identifies the uncertainties that are key for safety and attempts to include all plausible scenarios, their likelihood and their consequences. It yields more comprehensive estimates than deterministic analysis due to the range associated with the input variables. However, it is undeniable that deterministic analysis, in terms of SFs, is still the most widely used method for design and assessment in the dam industry [37]. With this in mind, the two approaches are combined in this study to provide a better understanding of the output of a probabilistic analysis in terms of practical considerations.

The main steps of the proposed methodology are shown in Figure 1; the methodology is divided into three stages: (i) pre-processing (steps 1–3), (ii) processing (steps 4–6), and (iii) post-processing (steps 6–9). In the pre-processing stage, the load and resistance input parameters that can be considered RVs and their respective PDFs are defined. Next, a prescreening of the model parameter sensitivity is performed by generating TDs to define the final set of RVs, PDFs, and maximum and minimum bounds. Then, to optimize the computational resources, a progressive design of experiments (DOE) strategy based on the progressive Latin hypercube sampling (PLHS) [38] technique is employed. Concerning the processing stage, a numerical model of the system is developed. Subgroups of the total number of simulations are analyzed sequentially, and the error in each iteration is compared to a specific tolerance, which, when satisfied, determines the final sample size. Finally, in the post-processing stage, safety recommendations are formulated by evaluating the system output, the probability of exceeding a target performance indicator conditioned on a load combination (LC) is estimated, and ANOVA is performed to assess the global parameter significance while Sobol's indices are estimated to quantify their importance. In the next sections, each of these steps is explained in detail.

**Figure 1.** Probabilistic-based procedure.

#### *3.1. Sampling Strategy*

An efficient sampling strategy that scales with the size of the problem and computational resources is essential for various sampling-based analyses, such as sensitivity and uncertainty analyses. To this end, PLHS [38], which sequentially generates sample points as the sample size increases while progressively preserving the distributional properties of interest (Latin hypercube properties, space-filling, etc.), is used. PLHS generates a series of smaller subsets (slices) such that (i) the first slice is a Latin hypercube, (ii) the progressive union of slices remains a Latin hypercube and achieves maximum stratification in any one-dimensional projection, and, as such, (iii) the entire sample set is a Latin hypercube. To optimize the sample size, a maximum number of permitted simulations, *N<sub>s</sub>*, is established. The total number of permitted simulations is then divided into *n* slices, each containing *n<sub>s</sub>* = *N<sub>s</sub>*/*n* samples. The simulations are run iteratively, one slice at a time, and the results for each iteration are cumulatively saved. To define the final sample size, the algorithm starts with one slice. In the next step, another slice is added, and the convergence of the algorithm is measured by calculating the relative errors presented in Equations (1)–(3) and comparing them to a tolerance of 1 × 10<sup>−3</sup>.

$$E_{\mu-\text{SSF}} = \left| 1 - \frac{\frac{\sum_{i=1}^{(n-1)\times n_s} \text{SSF}_i}{(n-1)\times n_s}}{\frac{\sum_{i=1}^{n\times n_s} \text{SSF}_i}{n\times n_s}} \right|, \tag{1}$$

$$E_{\sigma-\text{SSF}} = \left| 1 - \frac{\sqrt{\frac{\sum_{i=1}^{(n-1)\times n_s} \left(\text{SSF}_i - \mu_{\text{SSF}(n-1)}\right)^2}{(n-2)\times n_s}}}{\sqrt{\frac{\sum_{i=1}^{n\times n_s} \left(\text{SSF}_i - \mu_{\text{SSF}n}\right)^2}{(n-1)\times n_s}}}\right|, \tag{2}$$

$$E_{\text{SSF3}} = \left| 1 - \frac{\frac{\sum_{i=1}^{(n-1)\times n_s} \text{I}_{d3,i}}{(n-1)\times n_s}}{\frac{\sum_{i=1}^{n\times n_s} \text{I}_{d3,i}}{n\times n_s}}\right|, \tag{3}$$

where SSF is the sliding safety factor for the normal load case, *n<sub>s</sub>* is the number of samples per slice, *µ*<sub>SSF(*n*−1)</sub> and *µ*<sub>SSF*n*</sub> are the mean safety factors calculated with *n* − 1 and *n* slices, respectively, and I<sub>*d*3</sub> is an indicator function, where I<sub>*d*3,*i*</sub> = 1 if SSF<sub>*i*</sub> < 3, and I<sub>*d*3,*i*</sub> = 0 otherwise. The considered stopping conditions concern the statistics (mean and standard deviation) of the model response in Equations (1) and (2) and the probability of presenting an SSF lower than a given value in Equation (3). It should be noted that in Equation (3), a threshold of 3 is used because the SFs from the usual load case are considered, as will be explained later in Section 4.2. Usually, the convergence of the SSF probability is slower than that of the statistical moments, such that the obtained results could be accurate for the mean and standard deviation, but not for the SSF probability, especially for low probabilities. Therefore, Equations (1)–(3) are used together to evaluate the convergence of the algorithm.
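The slice-by-slice stopping rule can be sketched as below. This is a minimal illustration under stated assumptions: the function names are hypothetical, the SSF values are assumed to be stored in the order the slices were run, and the standard deviation uses the usual sample denominator (number of samples minus one) rather than the slice-based denominators printed in Equation (2). When no sample falls below the threshold, the probability error is reported as infinite, i.e., not converged.

```python
import math
import statistics


def slice_errors(ssf, n, n_s, threshold=3.0):
    """Relative errors between the first n-1 and the first n slices."""
    if n < 2 or n_s < 2:
        raise ValueError("need at least two slices of at least two samples")
    prev = ssf[:(n - 1) * n_s]
    curr = ssf[:n * n_s]
    e_mean = abs(1.0 - statistics.mean(prev) / statistics.mean(curr))
    # Assumes the SSF values are not all identical (non-zero spread).
    e_std = abs(1.0 - statistics.stdev(prev) / statistics.stdev(curr))
    p_prev = sum(1 for v in prev if v < threshold) / len(prev)
    p_curr = sum(1 for v in curr if v < threshold) / len(curr)
    e_prob = abs(1.0 - p_prev / p_curr) if p_curr > 0 else math.inf
    return e_mean, e_std, e_prob


def final_sample_size(ssf, n_s, tol=1e-3):
    """Add one slice at a time until all three errors fall below tol."""
    n_max = len(ssf) // n_s
    for n in range(2, n_max + 1):
        if max(slice_errors(ssf, n, n_s)) < tol:
            return n * n_s
    return None  # tolerance not met within the permitted simulations


# Made-up SSF results: the same four values repeated in every slice.
one_slice = [2.5, 3.5, 4.0, 2.8]
errors = slice_errors(one_slice * 2, n=2, n_s=4)
size = final_sample_size(one_slice * 50, n_s=4)
print(errors)
print(size)
```

Taking the maximum of the three errors reflects the statement that Equations (1)–(3) are used together, so the slowest-converging quantity controls the final sample size.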

#### *3.2. Sensitivity Analysis*

After the selection of the initial set of model parameters that can be considered to be RVs in the analysis, a prescreening is performed by generating TDs to determine the final set. This semideterministic analysis is used to evaluate a broad scenario trade space and narrow the set of options to those that are viable, given the performance, cost, and safety constraints. Because deterministic analysis requires a relatively short computational time, it is possible to rapidly iterate through possible scenarios at this low level of fidelity. However, it is difficult for TDs to evaluate the effect of simultaneous variation in a large number of input parameters on the model output results. To this end, after the simulations, the analysis is expanded, and ANOVA is performed to understand the global parameter significance and to draw inferences about the effect of the joint variation of the parameters on the target output.

#### 3.2.1. Tornado Diagrams

TDs quantify the impact of single RV variations on the target output results. They are also a very useful tool to identify which variables are worth resource investment to reduce uncertainties. A TD is composed of horizontal bars with widths given by the sensitivity of a specific RV. These bars are sorted vertically from the most influential at the top of the diagram to the least influential at the bottom; thus, the diagram looks like a tornado. Figure 2 presents the methodology, which can be explained as follows: (i) for each RV, the mean value and the 5–95% confidence interval (CI) are determined; (ii) numerical simulations are carried out considering the CI bounds of a single parameter while keeping the remaining parameters at their mean value, i.e., for N input RVs, 2N+1 analyses are performed; (iii) the difference in the results of the two extreme values of an RV gives the absolute value of variation. Next, the TD is constructed by sorting the parameters from the greatest to the lowest variation.

**Figure 2.** Tornado diagram (TD) methodology.
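The three TD steps can be written compactly as a one-at-a-time loop. The sketch below is illustrative only: the toy response function, parameter names, and bounds are hypothetical and do not come from the case study.

```python
def tornado(model, means, bounds):
    """Rank RVs by output swing using 2N + 1 model evaluations.

    means:  dict of RV name -> mean value
    bounds: dict of RV name -> (low, high), e.g. the 5% and 95% values
    """
    baseline = model(means)  # 1 run with every RV at its mean
    swings = []
    for name, (low, high) in bounds.items():  # 2 runs per RV
        y_low = model({**means, name: low})
        y_high = model({**means, name: high})
        swings.append((name, abs(y_high - y_low), y_low, y_high))
    # Largest swing first, so the widest bar sits at the top of the diagram.
    swings.sort(key=lambda item: item[1], reverse=True)
    return baseline, swings


def toy_response(p):
    """Hypothetical response function standing in for the dam model."""
    return 10.0 * p["friction"] - 2.0 * p["uplift"] + 0.5 * p["ice"]


means = {"friction": 0.8, "uplift": 1.0, "ice": 1.0}
bounds = {"ice": (0.0, 2.0), "friction": (0.6, 1.0), "uplift": (0.5, 1.5)}
baseline, swings = tornado(toy_response, means, bounds)
for name, swing, y_low, y_high in swings:
    print(name, swing, y_low, y_high)
```

Because each RV is varied with the others held at their means, the ranking captures only local, one-parameter effects.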

#### 3.2.2. Variance-Based Global Sensitivity Analysis

Even though TDs are easily interpretable and visually explicit, they can consider only variation in one parameter at a time, providing only a local sensitivity for each parameter. The concept of using the variance as an indicator of the importance of an input parameter is the basis for many variance-based global sensitivity analysis methods [39]. With this in mind, to evaluate the global significance of each parameter in the structure response by considering their joint variation, a sensitivity study using ANOVA is performed. ANOVA includes hypothesis tests that verify the significance of varying each parameter on the variance of the measured responses [40]. The results of the hypothesis tests are given in terms of a *p*-value, and a smaller *p*-value indicates greater evidence that the parameter has a strong influence on the dam response. A typical significance cut-off value of *α* = 0.05 is adopted here. Similarly, Sobol's method [34], another variance-based global sensitivity analysis, is applied because of its ability to quantify the importance of each input random variable. Sobol's method uses the decomposition of the variance to calculate the sensitivity indices.
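As an illustration of the variance decomposition idea, first-order Sobol indices can be estimated with a Monte Carlo pick-freeze scheme. The sketch below assumes independent inputs that are uniform on [0, 1] and uses a toy additive model; it is not the procedure or the software used in the study, and the function names are hypothetical.

```python
import random


def first_order_sobol(model, dim, n, rng):
    """Pick-freeze Monte Carlo estimate of first-order Sobol indices.

    Assumes independent inputs, each uniform on [0, 1].
    """
    A = [[rng.random() for _ in range(dim)] for _ in range(n)]
    B = [[rng.random() for _ in range(dim)] for _ in range(n)]
    f_a = [model(row) for row in A]
    f_b = [model(row) for row in B]
    pooled = f_a + f_b
    mean = sum(pooled) / len(pooled)
    var = sum((v - mean) ** 2 for v in pooled) / (len(pooled) - 1)
    indices = []
    for i in range(dim):
        # Rows of A with column i replaced by the matching entry of B.
        f_ab = [model(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        # Centering f_b leaves the expectation unchanged and reduces noise.
        v_i = sum((fb - mean) * (fab - fa)
                  for fa, fb, fab in zip(f_a, f_b, f_ab)) / n
        indices.append(v_i / var)
    return indices


def toy_model(x):
    """Hypothetical additive model: exact indices are 0.2 and 0.8."""
    return x[0] + 2.0 * x[1]


sobol = first_order_sobol(toy_model, dim=2, n=20000, rng=random.Random(0))
print(sobol)
```

For this additive toy model the two indices sum to one, since there are no interaction terms; the estimates carry Monte Carlo noise that shrinks as the sample size grows.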

#### *3.3. Conditional Probability of Exceedance Estimation*

Safety is subjective and is a matter of addressing public concern as specified in safety guidelines or regulations [15]. For the case of a deterministic analysis, this implies equating the SF to a formalized target criterion tolerated by the profession and society. With the move to a dam safety probabilistic-based approach, there has been a concomitant focus on estimating the probability of the failure of dams. In a probabilistic analysis, the decision of whether a dam is considered to be safe is made by comparing the calculated probability of failure with a stipulated tolerated probability of failure. The majority of risk guidelines relate to the total probability of failure, which is difficult to interpret in practice [12]. With this in mind, the two approaches are combined in this study, and the probability of presenting a SF lower than the minimum value prescribed by safety guidelines is estimated to provide a better understanding of the probabilistic analysis output in terms of practical considerations. This probability, which is conditioned on a determined LC, is formulated according to Equation (4) from a frequentist point of view.

$$\mathbb{P}_f(\text{SF} < \text{SF}_i \mid \text{LC}) = \frac{\sum_{j=1}^n \text{Samples}(\text{SF} < \text{SF}_i \mid \text{LC})}{\sum_{j=1}^n \text{Samples}(\text{LC})}, \tag{4}$$

where P<sub>*f*</sub>, the probability of exceeding a monitored response, is calculated as the number of samples that fall below a target SF (SF<sub>*i*</sub>) for a specific LC over the total number of samples generated for that LC.
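The frequentist estimate in Equation (4) amounts to a ratio of counts. The sketch below assumes each simulation result is stored as a pair of load-combination label and safety factor; that data layout, the function name, and the sample values are hypothetical.

```python
def conditional_probability(samples, load_case, target_sf):
    """Share of samples in one load combination whose SF is below the target.

    samples: iterable of (load_combination_label, safety_factor) pairs
    """
    in_case = [sf for lc, sf in samples if lc == load_case]
    if not in_case:
        raise ValueError("no samples available for this load combination")
    below = sum(1 for sf in in_case if sf < target_sf)
    return below / len(in_case)


# Made-up simulation results for two load combinations.
results = [
    ("normal", 3.4), ("normal", 2.7), ("normal", 3.1), ("normal", 2.9),
    ("unusual", 1.8), ("unusual", 2.2),
]
p_normal = conditional_probability(results, "normal", target_sf=3.0)
p_unusual = conditional_probability(results, "unusual", target_sf=2.0)
print(p_normal, p_unusual)
```

Since the estimate only filters and counts stored results, it can be re-evaluated for a different target SF or load combination without re-running any simulation.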

#### **4. Case Study**

#### *4.1. Numerical Model*

The present study is focused on a case study of a concrete gravity dam in Quebec, Canada. It is the largest gravity dam in the province, with 19 unkeyed monoliths, a maximum crest height of 78 m, and a crest length of 300 m (Figure 3a). The tallest monolith of the dam, with lift joints of 6 m, is selected as representative and modeled with the computer software CADAM3D [41] (Figure 3b), which performs stability analysis on gravity dams using the gravity method in accordance with the Canadian state-of-practice [9,11]. The validation of the numerical model was based on the fundamental period of the system and on global damping. By modifying the properties of the dam and of the foundation materials, the fundamental period and total damping of the system were 0.271 s and 1.05%, respectively, which match the results from in situ forced vibration tests [42]. Additionally, to perform all the simulations required for a probabilistic analysis, a script is created with MATLAB to automate the model runs. Only one loading case was analyzed; this case includes the self-weight of the block, the hydrostatic load exerted by the reservoir on the block, the uplift pressures at the concrete-foundation contact, and the ice load per unit length. The uplift pressure distribution is defined according to the United States Army Corps of Engineers (USACE) [11]. A nonlinear analysis that accounts for crack propagation along the lift joints is used to analyze the system response; if the base crack extends beyond the drain, the full uplift pressure is considered in the crack.

**Figure 3.** Case study dam: (**a**) cross-section and (**b**) CADAM3D numerical model of the tallest monolith.

#### *4.2. Performance Indicators*

The overall stability of concrete retaining structures is verified by imposing performance criteria on predefined indicators to ensure that a sufficient margin of safety against failure exists for each of the failure mechanisms considered for the body of the system. The performance indicators used herein are (i) the sliding safety factor (SSF), (ii) the overturning safety factor (OSF), (iii) the uplift safety factor (USF), and (iv) the position of the resulting force (PR). Table 1 shows the stability criteria for concrete gravity structures according to the Federal Energy Regulatory Commission (FERC) [43] and Canadian Dam Association (CDA) [9] guidelines. These guidelines propose SFs considering the level of knowledge in the strength parameters, where the required SFs are larger if no material tests are available. The LCs fall into three broad categories: normal, unusual, and extreme. These categories are related to a probability of exceedance for a period of time and at an acceptable level of safety. In the context of this study, only the conditions associated with water levels and ice loads are considered: normal operating conditions, unusual ice or blocked drains, and extreme safety floods.


† Friction and cohesion, ‡ Friction only.

#### *4.3. Modeling Parameters and Screening Study*

Each parameter in the analysis is defined either as a fixed value or as an RV associated with a PDF. The preliminary set of considered parameters is selected taking into account the input parameters in the CADAM3D numerical model and the RVs considered in probabilistic analyses in the literature [15,19,22]. Table 2 presents the parameters that are considered as RVs in the numerical analysis of the dam response and for which the uncertainty or likelihood of occurrence is formally included. All the remaining input parameters are held constant and represented by their best estimate values. The probability distributions are defined using historical data from the case study dam and, when not available, empirical data from similar dams [8,44,45]. Based on literature results [46] and the dam owner's expert judgement, it is assumed for the sampling and posterior analysis that 65 % of the time, the concrete-rock contact is not bonded (apparent cohesion), while the remaining 35 % of the time, a special treatment is present that ensures the bond (real cohesion). In the same manner, it is considered that the lift joints (concrete-concrete contact) are always bonded. Note that Table 2 shows that some of the model parameters are correlated with and/or conditional on each other provided that the aforementioned assumptions are made. For the bonded case, the base peak cohesion is taken as twice the base tensile strength, BCP<sup>R</sup> = 2 × BRT, according to the Griffith criterion [47], while the base minimum peak compressive stress is null; hence, BMCP = 0. Conversely, for the unbonded case, the base tensile strength is null, i.e., BRT = 0, and the BMCP is normally distributed. Similarly, given that the base residual internal friction angle is always lower than or equal to the base peak friction angle, this parameter is taken as the peak friction angle minus a normally distributed variation.
A uniform distribution is used for most of the parameters other than the minimum peak compressive stress so that more general cases can be considered in the analysis, as explained in the following sections.
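The conditional structure described above can be illustrated with a minimal Python sketch. Only the 35 %/65 % split, the relation BCP = 2 × BRT, and the constraints BMCP = 0 (bonded), BRT = 0 (unbonded) and BFR ≤ BFP come from the text; all numerical bounds, the half-normal variation obtained with `abs()`, and the function name `sample_base_joint` are assumptions introduced for illustration, since Table 2 is not reproduced here.

```python
import random

def sample_base_joint(rng):
    """Draw one correlated set of base-joint (concrete-rock) parameters."""
    bonded = rng.random() < 0.35              # 35 % bonded, 65 % unbonded (from the text)
    bfp = rng.uniform(35.0, 55.0)             # peak friction angle, deg (hypothetical bounds)
    # Residual angle = peak angle minus a non-negative variation.
    # abs() of a normal draw is an assumption that enforces BFR <= BFP.
    bfr = bfp - abs(rng.gauss(0.0, 3.0))      # hypothetical spread
    if bonded:
        brt = rng.uniform(0.2, 1.0)           # tensile strength, MPa (hypothetical bounds)
        bcp = 2.0 * brt                       # Griffith criterion: BCP = 2 x BRT
        bmcp = 0.0                            # no minimum compressive stress when bonded
    else:
        brt = 0.0                             # no tensile strength when unbonded
        bcp = rng.uniform(0.0, 0.1)           # apparent cohesion, MPa (hypothetical bounds)
        bmcp = max(0.0, rng.gauss(0.1, 0.02)) # normally distributed (hypothetical moments)
    return {"bonded": bonded, "BFP": bfp, "BFR": bfr,
            "BRT": brt, "BCP": bcp, "BMCP": bmcp}

rng = random.Random(42)
base_samples = [sample_base_joint(rng) for _ in range(10_000)]
bonded_fraction = sum(s["bonded"] for s in base_samples) / len(base_samples)
```

Sampling the bonded/unbonded state first and then drawing the dependent parameters keeps every sample physically consistent, which would not be guaranteed if each parameter were drawn independently.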


**Table 2.** Uncertain parameters.

The base parameters refer to the concrete-rock contact. The joint parameters refer to the concrete-concrete contact.

A screening study based on the TDs to assess the effect of each modeling parameter on the response of the dam is displayed in Figure 4. The reservoir elevation (RN), drain efficiency (DEI), and ice load (GLN) are some of the most influential parameters common to almost every performance indicator. However, note that the parameters most affecting a given response vary with respect to the considered lift joint, as shown in Figure 4a–d and Figure 4b,c. For this reason, Figure 5 presents the variation in the SSF with respect to loading and material property parameters for each joint. The effects of the joint peak cohesion (JCP) and RN are more significant for the upper lift joints and, as expected, this is even more evident for the GLN, whereas the joint peak friction angle (JFP) remains almost constant.

**Figure 4.** TD: (**a**) Base joint—sliding safety factor (SSF), (**b**) Base joint—position of the resulting force (PR), (**c**) Base joint—uplift safety factor (USF), (**d**) Crest joint—SSF, (**e**) Crest joint—PR, and (**f**) Base joint—overturning safety factor (OSF).

#### *4.4. Load Combinations*

In the structural safety evaluation of dams, unusual and extreme loads need particular attention, as there is more uncertainty because of low event accuracy. The evaluation has to consider two aspects: (i) load uncertainty and (ii) event frequency [15]. Table 3 displays the event frequency associated with each LC according to the dam owners' internal regulations and based on the USACE guidelines [48].

**Table 3.** Probabilities associated with load combinations (LCs).


Taking into account the ice and reservoir load, Figure 6 presents the cumulative distribution function (CDF) associated with each load for determining the annual probability of exceedance. The red data dots are obtained based on the monitored reservoir values of the case study dam and the ice load traditionally used for each LC. A log-normal distribution is fitted to these points in each case (LN1). However, to better consider the load uncertainty, and given that the CDF is fitted with only 3 points, two additional distributions are considered: a log-normal distribution with the same mean but twice the standard deviation, and a uniform distribution with the same mean and an upper bound equal to the extreme condition. Table 4 provides the parameters of these distributions.

**Table 4.** Loading parameter distributions.


**Figure 6.** Annual probability of exceedance: (**a**) RN and (**b**) GLN.

Figure 6a,b are combined to define the LCs by generating joint cumulative distribution functions, as shown in Table 5. Then, the joint cumulative distributions are intersected with horizontal constant probability planes equal to the annual probabilities defined in Table 3. Finally, the intersection is projected onto the XY plane and the regions corresponding to the usual, unusual, and extreme LCs are established, as shown in Figure 7. Figure 7a–d present these regions considering the independence between RN and GLN, whereas Figure 7e considers a negative correlation between RN and GLN, which provides a more realistic view of the case study dam's reservoir management. In the same manner, given that, for some performance indicators, DEI is more critical than GLN (Figure 4b,c,f), Figure 7f defines the LC regions as a function of RN and DEI. Likewise, the samples displayed in Figure 7 are obtained with an LHS strategy in which the samples are drawn from the target distributions defined in Table 5, which are the same distributions as those used for defining the LC regions. Notably, the samples are not evenly distributed in the LC regions, which implies that there is not a sufficient number of samples for the probabilistic analysis for each LC. To address these two drawbacks, i.e., to allow the LC definitions to be updated (or different LCs to be used for different performance indicators) and to obtain a more regular sampling space without re-running the simulations, the methodology proposed in Figure 8 is used. First, PLHS is used to generate samples to be simulated with the numerical model considering only the upper and lower bounds of the possible range of values of the parameters defining the LC. Then, in Step 2, the joint CDF is built considering the distributions assigned to each of these parameters, and the joint cumulative probability is calculated for the samples drawn in Step 1 to ensure that they follow the target distributions. Next, in Step 3, the LC regions are defined by projecting on the XY plane the intersection curve of the joint CDF with horizontal planes corresponding to the annual probability of exceedance (Table 3). Finally, in Step 4, only the samples in Step 1 that fall in each LC region, i.e., the samples whose joint cumulative probability corresponds to the prescribed annual probabilities of exceedance, are considered for the post-processing stage. Figure 9 presents the final samples per load combination for D5.
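The four steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the implementation used in the study: a plain (non-progressive) LHS stands in for PLHS, the two loads are treated as independent log-normal variables, and the bounds, log-normal parameters, and exceedance thresholds (`p_usual`, `p_unusual`) are hypothetical placeholders for the values in Tables 3 to 5.

```python
import math
import random
from statistics import NormalDist

def latin_hypercube(n_samples, n_dims, rng):
    """Step 1: plain LHS on the unit cube, one point per equal-width stratum."""
    columns = []
    for _ in range(n_dims):
        column = [(k + rng.random()) / n_samples for k in range(n_samples)]
        rng.shuffle(column)
        columns.append(column)
    return [list(point) for point in zip(*columns)]

def lognormal_cdf(x, mu_log, sigma_log):
    """CDF of a log-normal variable with log-mean mu_log and log-std sigma_log."""
    return NormalDist(mu_log, sigma_log).cdf(math.log(x))

def classify_lc(point, marginals, p_usual, p_unusual):
    """Steps 2-4: joint CDF (independence assumed) mapped to an LC region."""
    joint_cdf = 1.0
    for value, (mu_log, sigma_log) in zip(point, marginals):
        joint_cdf *= lognormal_cdf(value, mu_log, sigma_log)
    p_exceed = 1.0 - joint_cdf                # annual probability of exceedance
    if p_exceed >= p_usual:
        return "usual"
    if p_exceed >= p_unusual:
        return "unusual"
    return "extreme"

rng = random.Random(0)
bounds = [(1.0, 10.0), (1.0, 10.0)]           # hypothetical load ranges (arbitrary units)
marginals = [(math.log(3.0), 0.4)] * 2        # hypothetical log-normal parameters
unit = latin_hypercube(500, 2, rng)
samples = [[lo + (hi - lo) * u for u, (lo, hi) in zip(p, bounds)] for p in unit]
labels = [classify_lc(s, marginals, p_usual=1e-1, p_unusual=1e-3) for s in samples]
samples_per_lc = {lc: [s for s, l in zip(samples, labels) if l == lc]
                  for lc in ("usual", "unusual", "extreme")}
```

Because the samples are generated only from the parameter bounds, changing `marginals` or the thresholds relabels the existing samples without re-running the numerical model, which is the flexibility the procedure is meant to provide.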

**Figure 7.** LC regions for RN-GLN: (**a**) D1, (**b**) D2, (**c**) D3, (**d**) D4, (**e**) D5, and (**f**) D6.


† Estimated from expert judgement and the historical reservoir management of the case study dam.

**Figure 8.** Procedure for obtaining samples per LC.

**Figure 9.** Samples per LC for D5: (**a**) cumulative distribution function, and (**b**) XY projection.

#### **5. Results and Discussion**

#### *5.1. Sample Size*

In the context of this study, and as a trade-off between the available computational budget and time, a maximum number of 10<sup>4</sup> simulations is established. The total number of permitted simulations is divided into 100 slices, containing 100 samples each. The PLHS technique [38] is implemented via the VARS-Tool software package [49], which is a MATLAB Toolbox. Figure 10 presents the results of Equations (1)–(3) for each slice and for each joint. As seen from Figure 10, to obtain *E*<sub>σ−SSF</sub> and *E*<sub>SSF3</sub> within the specified tolerance, *n* = 50 slices are required, while, for *E*<sub>µ−SSF</sub> ≤ 1 × 10<sup>−3</sup>, *n* = 60 are necessary. Even though the number of slices can be automatically determined by the algorithm, in this study, the number of slices is set to *n* = 55 given that with *n* = 50 the stopping condition in Equations (2) and (3) is already met. Ultimately, *N<sub>s</sub>* = 100 × 55 samples are considered for the probabilistic analysis.

#### *5.2. Effect of Model Demand PDF Variation in the Analysis*

To assess the effect of the model demand parameter definition in the probabilistic analysis, the probability of exceeding a target SF conditioned on the LC is estimated. This probability is calculated according to Equation (4) considering the load combination distributions presented in Table 5. Figure 11 displays the probability of exceeding an SSF prescribed by the CDA guidelines (Table 1) given usual, unusual, and extreme LCs for each lift joint. Note that, for the usual LC, the five bottom lift joints are affected, while, for the unusual and extreme LCs, only the concrete-rock (base) joint is affected. This can be explained by considering the SSF associated with the load case. Given that the concrete-concrete joints generally presented 2 < SSF < 3, for a less critical load case, i.e., a higher target safety factor, the probability of presenting a safety factor lower than the threshold is higher for the usual load case, thus affecting more joints. For the unusual and extreme LCs, the probability is even lower due to the considered target SSF, and only the base joint, which presents different material properties than the concrete-concrete joints, is affected. It can be concluded that, for unusual and extreme load cases, only the base joint is critical, while, for usual loading conditions, attention must also be paid to the five bottom joints. Moreover, it is observed that using distributions that more realistically represent the LCs, such as D5, which considers that the reservoir is high and the ice load is low during the summer, and vice versa, provides less conservative probabilities of exceedance. Only the functions corresponding to the sliding stability criteria are presented given that the numerical model simulations showed that OSF and USF are always respected, highlighting the adequate performance of the case-study dam.
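As a hedged illustration of how such a conditional probability can be estimated from simulation output, the Python sketch below counts, among the samples assigned to a given LC, the fraction whose SSF falls below the target value. The SSF values, LC labels, and target SFs are invented for the example, and the function name `prob_below_target` is hypothetical; the study's own estimator is Equation (4), which may differ in detail.

```python
def prob_below_target(ssf_values, lc_labels, lc, target_sf):
    """Fraction of samples in load combination `lc` with SSF below `target_sf`."""
    in_lc = [sf for sf, label in zip(ssf_values, lc_labels) if label == lc]
    if not in_lc:
        return float("nan")                   # no sample fell in this LC region
    return sum(sf < target_sf for sf in in_lc) / len(in_lc)

# Hypothetical simulation results for one lift joint.
ssf = [3.2, 1.4, 2.1, 1.1, 2.8, 0.9]
labels = ["usual", "usual", "usual", "unusual", "unusual", "extreme"]

p_usual = prob_below_target(ssf, labels, "usual", target_sf=1.5)      # hypothetical target
p_extreme = prob_below_target(ssf, labels, "extreme", target_sf=1.1)  # hypothetical target
```

With only a handful of samples per region such an estimate is very noisy, which is why an even distribution of samples across the LC regions matters.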

**Figure 10.** Progressive Latin hypercube sampling (PLHS) slice iterations: (**a**) *E*<sub>µ−SSF</sub>, (**b**) *E*<sub>σ−SSF</sub>, and (**c**) *E*<sub>SSF3</sub>.

**Figure 11.** Conditional probability functions: (**a**) usual, (**b**) unusual, and (**c**) extreme.

#### *5.3. Model Parameter Recommendations for Adequate Performance*

#### 5.3.1. Influence of Model Parameters

The significance of each modeling parameter in Table 2 on the structural response of the dam is also assessed by a screening study using ANOVA. For each response monitored, a multiway ANOVA is conducted using MATLAB. The results of this analysis are shown in Table 6, where the *p*-values < 0.05 indicate statistically significant parameters that should be treated with special attention. It should be mentioned that, for a pure sensitivity analysis, as is the case here, using uniform distributions is acceptable, but not for a reliability analysis since it is hard to find a physical parameter (e.g., the rock/concrete strength parameters) that follows a uniform distribution. The results are presented for the base and crest lift joints, which are considered as representative of the system.

Overall, the results reveal that all the parameters have a statistically significant effect on at least one of the critical dam responses. Additionally, the parameters used to define the LCs, such as RN, GLN, and DEI, are important for every SF at the base and/or the crest. These results can be used to reduce the number of parameters considered in the probabilistic analysis; however, given that nearly all the parameters are identified as significant for at least one of the response quantities of interest, this approach is not used. Beyond this, however, the results of the screening study are useful for better understanding the effect of the joint parameter variations on the dam's response and for validating the results of more simplified methods, such as the TD.

**Table 6.** *p*-values from ANOVA: summarizing the significant parameters for the base and neck sliding of the dam.


† Friction and cohesion, ‡ Friction only.

In the same way as for the multiway ANOVA, for each response monitored, Sobol's indices were calculated using UQLab [50] through MATLAB. Total sensitivity indices, which measure the main effects of a given parameter and all the interactions (of any order) involving that parameter, as well as the first order indices (no interaction), were estimated as shown in Figure 12. It is observed from Figure 12a that, for the concrete-rock joint, the parameters that had the greatest effect on the monitored response are RN, DEI, BFP, and BFR. In particular, the most important parameters are RN and DEI for PR, DEI and BFR for SSFR, BCP for SSF, and DEI for OSF and USF. In contrast, it is observed from Figure 12b that, for the concrete-concrete joint, the most important parameter is RN, followed by GLN and JCP. In general, the parameters that most affect PR, OSF, and SSFR are RN and GLN, while RN, GLN, and JCP have the most influence on SSF. Finally, only RN affects USF.
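For readers unfamiliar with first-order and total Sobol indices, the following Python sketch estimates both with a generic Monte Carlo pick-and-freeze scheme (a Saltelli-type estimator for the first-order index and a Jansen-type estimator for the total index). It is not the UQLab implementation used in the study; the function names, the toy response, and the assumption of independent inputs uniform on [0, 1] are introduced only for illustration.

```python
import random

def sobol_indices(model, n_inputs, n_samples, rng):
    """First-order and total Sobol indices for independent U(0, 1) inputs."""
    A = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    B = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    fA = [model(x) for x in A]
    fB = [model(x) for x in B]
    # Centre the outputs to reduce the variance of the estimators.
    mean = (sum(fA) + sum(fB)) / (2 * n_samples)
    fA = [v - mean for v in fA]
    fB = [v - mean for v in fB]
    var = (sum(v * v for v in fA) + sum(v * v for v in fB)) / (2 * n_samples)
    first, total = [], []
    for i in range(n_inputs):
        # AB_i: rows of A with column i taken from B.
        fABi = [model(a[:i] + [b[i]] + a[i + 1:]) - mean for a, b in zip(A, B)]
        v_i = sum(fb * (fab - fa) for fa, fb, fab in zip(fA, fB, fABi)) / n_samples
        vt_i = sum((fa - fab) ** 2 for fa, fab in zip(fA, fABi)) / (2 * n_samples)
        first.append(v_i / var)               # main effect only
        total.append(vt_i / var)              # main effect plus all interactions
    return first, total

def toy_response(x):
    """Hypothetical additive response: analytical indices are 0.2 and 0.8."""
    return 1.0 * x[0] + 2.0 * x[1]

first_order, total_order = sobol_indices(toy_response, 2, 50_000, random.Random(1))
```

For an additive model such as this toy response, the first-order and total indices coincide; a gap between the two for a given parameter signals interaction effects.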

**Figure 12.** Sobol indices: (**a**) Base and (**b**) Crest.

#### 5.3.2. Stability Analysis Results

The final system output, together with the global sensitivity analysis, can also be used to assess the dam performance through the formulation of safety recommendations. To this end, the simulation results for the base SSF and PR at the base and at the crest are plotted against the two most influential parameters according to Figures 4 and 12. The simulation results are presented in Figure 13a–c, where it can be seen that the response of interest can be approximated well by a surface defined with only two model parameters, as shown in Table 7.


Finally, ranges of values that meet the SFs provided by safety guidelines are formulated by intersecting the parametric surfaces with horizontal planes at the target SFs specified in Table 1. Figure 13d presents the BCP-BFP zones that result in base SSF values lower than those prescribed by the CDA [9], while Figure 13e,f present the RN-DEI and RN-GLN zones that provide a resultant within the 1/3 and 1/2 median of the base for the base and the crest lift joints, respectively.

**Figure 13.** Stability analysis output: (**a**) SSF-Base, (**b**) PR-Base, (**c**) PR-Crest, (**d**) BCP-BFP regions, (**e**) RN-drain efficiency (DEI) regions and (**f**) RN-GLN regions.

#### **6. Conclusions**

Uncertainties prevail in the safety assessment of dams, particularly in the identification of failure modes, loading conditions and model parameter estimations. Hence, it is important to consider various sources of uncertainties for the safety assessment of dams, and the means to do so is through a probabilistic analysis. The main goal of this study was to develop a probabilistic-based methodology to sufficiently assess the safety of dams with a flexible sampling strategy so that new data can be easily and efficiently incorporated in the analysis without the re-evaluation of the system simulations. Moreover, a prescreening of the model parameter sensitivity was performed with the generation of TDs, which was later validated with variance-based global sensitivity analysis methods, such as ANOVA and Sobol's indices. Finally, safety recommendations were formulated by evaluating the system output, and the probability of the target SF exceeding the guidelines given a determined LC was estimated.

The proposed procedure is more robust, computationally efficient, and more easily interpretable than conventional methods while accounting for uncertainties in the resistance and loading parameters, which would otherwise be neglected. The perspective taken is that dam safety assessment is a tool for providing insights that can strengthen both the engineering and decision aspects of dam safety management. As such, this study will allow professionals in the dam industry and dam owners to expedite the safety assessment of gravity dams and to identify the parameter uncertainties affecting the dam response the most so that economic resources can be invested in the exhaustive study of these parameters.

**Author Contributions:** Conceptualization, R.L.S. and B.M.; methodology, R.L.S.; validation, B.M., P.P. and J.E.P.; formal analysis, R.L.S.; resources, P.P. and B.M.; writing—original draft preparation, R.L.S.; writing—review and editing, P.P.; visualization, J.E.P.; supervision, P.P. and B.M.; project administration, P.P.; funding acquisition, P.P. All authors have read and agreed to the published version of the manuscript.

**Funding:** The authors acknowledge the financial support of MITACS, the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Fonds de recherche du Quebec–Nature et technologies (FRQNT).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy.

**Acknowledgments:** The authors acknowledge the financial support of MITACS, the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Fonds de recherche du Quebec–Nature et technologies (FRQNT). Any opinions, findings, and conclusions or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of the sponsors.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**


## *Article* **Deep Neural Network and Polynomial Chaos Expansion-Based Surrogate Models for Sensitivity and Uncertainty Propagation: An Application to a Rockfill Dam**

**Gullnaz Shahzadi and Azzeddine Soulaïmani \***

Department of Mechanical Engineering, École de Technologie Supérieure, 1100 Notre-Dame W., Montréal, QC H3C 1K3, Canada; gullnaz.shahzadi.1@ens.etsmtl.ca

**\*** Correspondence: azzeddine.soulaimani@etsmtl.ca

**Abstract:** Computational modeling plays a significant role in the design of rockfill dams. Various constitutive soil parameters are used to design such models, which often involve high uncertainties due to the complex structure of rockfill dams comprising various zones of different soil parameters. This study performs an uncertainty analysis and a global sensitivity analysis to assess the effect of constitutive soil parameters on the behavior of a rockfill dam. A Finite Element code (Plaxis) is utilized for the structure analysis. A database of the computed displacements at inclinometers installed in the dam is generated and compared to in situ measurements. Surrogate models are significant tools for approximating the relationship between input soil parameters and displacements and thereby reducing the computational costs of parametric studies. Polynomial chaos expansion and deep neural networks are used to build surrogate models to compute the Sobol indices required to identify the impact of soil parameters on dam behavior.

**Keywords:** sensitivity analysis; polynomial chaos expansion; uncertainty; deep neural networks; rockfill dams

#### **1. Introduction**

To meet the new challenges faced by geotechnical engineers, the use of innovative computer-based models has been growing exponentially. The complex structures and uncertainties involved in the design of rockfill dams are a major challenge for predicting dam behavior [1,2]. Numerical methods, computational statistics and machine learning play a significant role in building improved, reliable rockfill dam models, helping to predict their behavior and reduce the cost of construction. The use of sensitivity analysis has attracted the interest of engineers seeking to understand the complex behavior associated with soil parameters. The main rationale for a sensitivity analysis using Sobol indices is to identify the most significant parameters in the variability of the output response [3]. Sensitivity analysis methods are usually categorized into local and global sensitivity analyses [4]. Local sensitivity analysis quantifies the local impact of an input parameter on a model, whereas global sensitivity analysis is focused on the uncertainty in the output due to the uncertainty in the input [5]. Numerous techniques have been developed for obtaining Sobol indices through variants of the Monte Carlo sampling technique [6], and variance-based global sensitivity analyses have been performed to identify the parameters that most affect dam stability [7], although these techniques often require a large number of simulations [8]. Surrogate-based methods are more widely used because of their efficiency and cost savings [9–13]. Polynomial chaos expansion-based surrogate models have recently been used for the sensitivity analysis of dams [14].

This work evaluates surrogate-based and variance-based global sensitivity analyses in the design of a rockfill dam. Finite element method (FEM) models with appropriate soil parameters are often utilized for dam modeling and design [15–17]. Various constitutive models exist, each involving a different set of parameters, tested on and used for several geotechnical problems [18]. In this study, a two-dimensional plane-strain finite element model is used in Plaxis to compute the displacements and stresses for a vertical cross-section of the dam; it employs a simple constitutive soil model, the Mohr–Coulomb (MC) model [19]. The soil parameters cohesion (*C*), specific weight (*ρ*), shear modulus (*G*<sub>ref</sub>), Poisson coefficient (*ν*) and friction angle (*φ*) are the input parameters for the MC model [20]. Moreover, the Mohr–Coulomb constitutive model is widely used in geotechnical engineering practice because of its simplicity and because it requires fewer parameters than more complex constitutive models such as the Hardening Soil (HS) model [21]. The Sobol sampling method is applied to generate the samples of soil parameters as the input [22,23]. Subsequently, the parameters are assigned to the numerical model and the displacements are calculated at the positions of each of the inclinometers. Once the database of the inputs and outputs has been produced, the dam response can be estimated with respect to the uncertainty associated with the input parameters. The polynomial chaos expansion (PCE) and deep neural network (DNN) techniques [24–27] are used to build the surrogate models to evaluate the Sobol indices. The surrogate models are trained by utilizing an error function that measures the difference between the computed and measured displacements on the inclinometers.

**Citation:** Shahzadi, G.; Soulaïmani, A. Deep Neural Network and Polynomial Chaos Expansion-Based Surrogate Models for Sensitivity and Uncertainty Propagation: An Application to a Rockfill Dam. *Water* **2021**, *13*, 1830. https://doi.org/10.3390/w13131830

Academic Editors: M. Amin Hariri-Ardebili, Fernando Salazar, Farhad Pourkamali-Anaraki, Guido Mazzà and Juan Mata

Received: 29 April 2021 Accepted: 27 June 2021 Published: 30 June 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **2. Methodology**

The methodology comprises two main phases: surrogate model approximation and sensitivity–uncertainty analysis.

#### *2.1. Surrogate Models*

In the current challenging and technically competitive environment, surrogate models can increase efficiency and reduce the computational costs of a problem or design process. Several surrogate-modeling techniques have been applied to uncertainty analysis, sensitivity analysis, and optimization. Polynomial chaos, a probabilistic approach, and deep neural networks are used in this study.

#### 2.1.1. Polynomial Chaos Expansion (PCE)

Consider a physical model represented by a function $y = M(x)$, where $x \in \mathbb{R}^n$, $y \in \mathbb{R}^m$, $n$ is the number of input quantities and $m$ the number of outputs. For simplicity, the $m = 1$ case will be considered in the following description. The uncertainties in the input variables and their propagation to the output lead to the description of $x$ and $y$ as random variables $X = (X_1, X_2, X_3, \ldots, X_n)$ and $Y$, respectively [28–30]. For a specific value of $x$, the corresponding response (a realization) $y$ is actually computed by executing a deterministic numerical solver for the non-intrusive variant of PCE. The joint probability density function (PDF) of the random vector $X$ is denoted by $\rho_x$. Assuming that the input random variables $X_i$ are independent, $\rho_x$ is the product of the marginal densities, $\rho_x(X) = \prod_{i=1}^{n} \rho_i(X_i)$. A polynomial chaos expansion approximates the response $Y$ as a linear combination of orthonormal polynomials $\phi_\alpha(X)$:

$$\bar{Y}(X) = \sum_{\alpha=1}^{NP} b_\alpha \phi_\alpha(X),\tag{1}$$

where $b_\alpha$ are the expansion coefficients forming the vector $b = (b_1, b_2, b_3, \ldots, b_{NP})^T$. In a full PCE, the number of expansion coefficients $NP$ depends on the polynomial order $p$ and the number of random input parameters $n$, and is given by $NP = \frac{(n+p)!}{p!\,n!}$. The multivariate basis of polynomials $\phi_\alpha(X)$ can be constructed as a tensor product of univariate orthonormal polynomials $\phi_{p_i^\alpha}(X_i)$, that is, $\phi_\alpha(X) = \prod_{i=1}^{n} \phi_{p_i^\alpha}(X_i)$, where $p_i^\alpha$ ($i = 1, \ldots, n$) is a multi-index vector. The optimal choice of the univariate polynomial basis function is closely related to the probability density functions $\rho_i(X_i)$ [29]. For instance, Legendre polynomials serve as an optimal basis function for uniform distributions. The polynomial chaos expansion coefficients $b_\alpha$ can be computed in a non-intrusive and affordable way using a regression approach. A dataset $D$ is composed of $N$ input vectors $X_D = (x_D^{(1)}, x_D^{(2)}, \ldots, x_D^{(N)})^T$ sampled from the PDF $\rho_x$, and their corresponding responses are put in a vector $Y_D = (y_D^{(1)}, y_D^{(2)}, \ldots, y_D^{(N)})^T$, with $y_D^{(i)} = M(x_D^{(i)})$. The expansion coefficients with a regularization term can be obtained by minimizing the error $\sum_{i=1}^{N} (y_D^{(i)} - \bar{Y}(x_D^{(i)}))^2 + \lambda_P b^T b$. Defining $\Phi$ as the design matrix whose components are $\phi_j(x_D^{(i)})$ ($i = 1, \ldots, N$; $j = 1, \ldots, NP$), the expansion coefficients vector is then given as the solution of the regularized least-squares system:

$$b = (\Phi^T \Phi + \lambda_P I)^{-1} \Phi^T Y_D,\tag{2}$$

where $\lambda_P$ is a regularization parameter and $I$ is the identity matrix. The number of sample points is defined as $N = \gamma\, NP$, where $\gamma \geq 1$ is an oversampling parameter used to control the accuracy of the PCE [31,32]. The sample input vectors can be generated using efficient sampling algorithms such as the Latin hypercube sampling algorithm (LHS) or the Sobol scheme [22,23,33]. Once the expansion coefficients are computed, the polynomial expansion defined in Equation (1) can be used to predict the approximate response for any input variable (within the learning domain). For instance, the mean and the variance of the response can be computed using the basis function orthonormality property [32]. Their expressions are given by:

$$\mu_D = \int \bar{Y} \rho_x \, dX = \int \Big(\sum_{\alpha=1}^{NP} b_\alpha \phi_\alpha(X)\Big) \rho_x \, dX = b_1,\tag{3}$$

and

$$\sigma_D^2 = \int (\bar{Y} - \mu_D)^2 \rho_x \, dX = \sum_{\alpha=2}^{NP} b_\alpha^2.\tag{4}$$

**Remark 1.** *The input variables are assumed to be independent in the above approach. However, it is possible to use the Rosenblatt transformation [34] to formulate the problem as a function of auxiliary independent variables.*
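As an illustration of the regression-based PCE described in this subsection, the following minimal Python sketch fits a one-dimensional, second-order expansion on an orthonormal Legendre basis for an input uniform on [−1, 1], and then reads the mean and variance from the coefficients. The toy model, the oversampling factor, and all function names are assumptions made for this example; they are not taken from the study, which uses a finite element model as $M(x)$.

```python
import math
import random

def legendre_orthonormal(x):
    """Orthonormal Legendre basis up to order 2 for X ~ U(-1, 1)."""
    return [1.0,
            math.sqrt(3.0) * x,
            math.sqrt(5.0) * 0.5 * (3.0 * x * x - 1.0)]

def solve(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z

def fit_pce(xs, ys, lam=0.0):
    """Coefficients b = (Phi^T Phi + lam I)^(-1) Phi^T Y, as in the regularized system."""
    Phi = [legendre_orthonormal(x) for x in xs]
    n_terms = len(Phi[0])
    A = [[sum(row[j] * row[k] for row in Phi) + (lam if j == k else 0.0)
          for k in range(n_terms)] for j in range(n_terms)]
    rhs = [sum(row[j] * y for row, y in zip(Phi, ys)) for j in range(n_terms)]
    return solve(A, rhs)

def toy_model(x):
    """Hypothetical stand-in for the numerical solver M(x)."""
    return x * x + x

rng = random.Random(0)
n_inputs, order, gamma = 1, 2, 10
NP = math.comb(n_inputs + order, order)       # (n + p)! / (p! n!)
xs = [rng.uniform(-1.0, 1.0) for _ in range(gamma * NP)]
ys = [toy_model(x) for x in xs]
b = fit_pce(xs, ys)
pce_mean = b[0]                               # first coefficient (b_1 in the text)
pce_var = sum(c * c for c in b[1:])           # sum of the remaining squared coefficients
```

Because the toy model is itself a second-order polynomial, the fit is exact up to round-off, and the mean (1/3) and variance (19/45) follow directly from the coefficients without any further sampling.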

#### 2.1.2. Deep Neural Networks

Deep neural networks (DNN) are widely considered to be a powerful and general numerical approach to building a nonlinear mapping between a set of inputs (features) and their corresponding outputs (labels or targets). Deep neural networks are well known in data science, with various applications in science and engineering. In the PCE approach, the surrogate model is comprised of linear combinations of fixed basis functions. Such models have useful practical applications, but they may be limited by the curse of dimensionality for large datasets. It should be mentioned that much effort has been invested in reducing the severity of the curse of dimensionality by using sparse expansions [35]. Furthermore, in order to apply such models to large-scale problems, the basis functions must be adapted to the data. There is a large body of literature on deep networks [25–27], and a brief description is given next. Deep neural networks use parametric forms for basis functions, in which parameter values are adapted during training. Moreover, with respect to these parameters, the model is nonlinear as it uses nonlinear activation functions. Figure 1 illustrates a DNN with one hidden layer. The input data are mapped to the hidden layer (1) to compute

$$h_j^{(1)} = f\Big(\sum_{i=1}^{n} W_{ji}^{(1)} x_i + a_j^{(1)}\Big),\tag{5}$$

which are then fed to the output layer (*o*) to compute the response

$$y_k = g\Big(\sum_{j=1} W_{kj}^{(o)} h_j^{(1)} + a_k^{(o)}\Big),\tag{6}$$

where $f$ and $g$ are activation functions, $W_{ji}^{(1)}$ and $W_{kj}^{(o)}$ are the weight parameters, and $a_j^{(1)}$ and $a_k^{(o)}$ are the bias parameters.

**Figure 1.** One-layer neural network.
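The forward pass of such a one-hidden-layer network can be written in a few lines of Python. The sketch below uses a sigmoid for $f$ and the identity for $g$; the weights, biases, and input values are arbitrary numbers chosen for the example, and the function names are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic activation used here for the hidden layer."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W1, a1, Wo, ao, f=sigmoid, g=lambda z: z):
    """Hidden layer: h_j = f(sum_i W1[j][i] x_i + a1[j]); output: y_k = g(sum_j Wo[k][j] h_j + ao[k])."""
    hidden = [f(sum(w * xi for w, xi in zip(row, x)) + bias)
              for row, bias in zip(W1, a1)]
    return [g(sum(w * h for w, h in zip(row, hidden)) + bias)
            for row, bias in zip(Wo, ao)]

# Arbitrary example: 2 inputs, 2 hidden neurons, 1 output.
W1 = [[0.0, 0.0], [1.0, -1.0]]
a1 = [0.0, 0.0]
Wo = [[2.0, 4.0]]
ao = [0.5]
y = forward([3.0, 3.0], W1, a1, Wo, ao)
```

Both hidden pre-activations are zero for this input, so each hidden neuron outputs 0.5 and the response is 2 × 0.5 + 4 × 0.5 + 0.5 = 3.5.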

The number of neurons in the input layer is the number of input features *n*, and *m* is the dimension of the neural network response vector *YNN*. The number of hidden layers in a deep neural network and the number of neurons in each hidden layer are hyperparameters, which are optimized by experimentation guided by monitoring validation and test errors. To determine the weights and bias parameters, the network is trained on the dataset by minimizing the loss (error) function. As described earlier, a dataset *D* is composed of *N* input vectors *X<sup>D</sup>* = (*x* (1) *D* , *x* (2) *D* , ..., *x* (*N*) *D* ) *T* , which are sampled from the PDF, and of the corresponding targets, which are put in a vector *Y<sup>D</sup>* = (*y* (1) *D* , *y* (2) *D* , ..., *y* (*N*) *D* ) *T* with *y* (*i*) *<sup>D</sup>* = *M*(*x* (*i*) *D* ). In regression problems, the mean square error (MSE), also called the loss function, between the model outputs and the labels (targets), is used along with a regularization term:

$$J = \frac{1}{N} \sum\_{i=1}^{N} \left\{ \frac{1}{2} \sum\_{k=1}^{m} \left[ y\_k^{(i)} - y\_{D,k}^{(i)} \right]^2 \right\} + \lambda \sum\_{l,a,\emptyset} (W\_{a\emptyset}^{(l)})^2 \,\prime \tag{7}$$

where *λ* is a regularization hyperparameter. An iterative approach based on the backpropagation algorithm is used to minimize the loss function. The activation function *f* is usually the sigmoid or the rectified linear unit, while *g* is the identity function for our regression problem. An example of a deep network is presented in Figure 2, where five hidden layers are used; the input layer has *n* = 5 input parameters, and the output layer has *m* = 64 responses ($Y_{NN} = (y_1, y_2, \ldots, y_{64})^T$).
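The structure of the loss in Equation (7) can be illustrated with a short sketch: a data term averaged over the samples plus a penalty on the squared weights. The function name and the nested-list layout of the weights are assumptions made for this example only.

```python
def loss(y_pred, y_data, weights, lam):
    """Regularized mean-square-error loss in the form of Equation (7).

    y_pred, y_data: lists of N response vectors (each of length m).
    weights: list of layers, each layer a list of rows of weights.
    lam: regularization hyperparameter (lambda).
    """
    n_samples = len(y_data)
    # Data term: (1/N) * sum_i { 0.5 * sum_k (y_k - y_D,k)^2 }
    data_term = sum(
        0.5 * sum((p - t) ** 2 for p, t in zip(pred, target))
        for pred, target in zip(y_pred, y_data)
    ) / n_samples
    # Penalty term: lam * sum of all squared weights (biases are not penalized)
    penalty = lam * sum(w ** 2 for layer in weights for row in layer for w in row)
    return data_term + penalty

# Hypothetical values, N = 2 samples and m = 2 outputs.
example = loss([[1.0, 2.0], [0.0, 0.0]], [[1.0, 0.0], [0.0, 2.0]], [[[1.0, 2.0]]], 0.1)
```

A larger *λ* pulls the weights towards zero at the expense of the fit to the data.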

It can be shown that minimizing the error function $E_D$ in Equation (8) is equivalent to minimizing the negative log-likelihood, assuming Gaussian noise in the targets with a constant variance $\sigma_D^2$.

$$E_D = \frac{1}{N} \sum_{i=1}^{N} \left\{ \frac{1}{2} \sum_{k=1}^{m} \left[ y_k^{(i)} - y_{D,k}^{(i)} \right]^2 \right\} = \frac{1}{2N} \sum_{i=1}^{N} (Y_D - Y_{NN})^2. \tag{8}$$

Moreover, maximizing the log-likelihood with respect to the noise variance gives the solution $\sigma_{D,ML}^2 = \frac{1}{N} \sum_{i=1}^{N} (Y_D - Y_{NN})^2$. Therefore, the prediction of the network for a given input parameter vector *X* is given by a Gaussian probability distribution with a mean $\bar{Y}(X) = Y_{NN}$ and a variance $\sigma_{D,ML}^2$, which represents the noise in the data. There are many public domain implementations of (standard) deep neural networks, such as the TensorFlow library [36]. In this work, the MATLAB deep learning toolbox is used [37].
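The link between the squared error and the Gaussian likelihood can be written out in a few lines. The following is a sketch for a single output component, assuming independent Gaussian noise with constant variance $\sigma_D^2$; the per-sample symbols $y_D^{(i)}$ (target) and $y^{(i)}$ (network output) are used here for illustration. The negative log-likelihood of the targets is

$$-\ln p(Y_D \mid X_D, \sigma_D^2) = \frac{1}{2\sigma_D^2} \sum_{i=1}^{N} \left( y_D^{(i)} - y^{(i)} \right)^2 + \frac{N}{2} \ln \sigma_D^2 + \frac{N}{2} \ln (2\pi).$$

For a fixed $\sigma_D^2$, only the first term depends on the weights, and it is proportional to the sum of squared errors, so both objectives share the same minimizer. Setting the derivative with respect to $\sigma_D^2$ to zero gives

$$\frac{N}{2\sigma_D^2} - \frac{1}{2\sigma_D^4} \sum_{i=1}^{N} \left( y_D^{(i)} - y^{(i)} \right)^2 = 0 \quad \Longrightarrow \quad \sigma_{D,ML}^2 = \frac{1}{N} \sum_{i=1}^{N} \left( y_D^{(i)} - y^{(i)} \right)^2,$$

which is the maximum likelihood estimate of the noise variance quoted in the text.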

#### 2.1.3. Ensemble of Models

In machine learning, ensembling is a technique used to improve the predictive performance and reduce the generalization error by training several models separately and subsequently combining their solutions [24–27]. The idea here is that the ensemble (i.e., averaged solution) will perform at least as well as any of its members. Given a dataset, different neural network solutions can be obtained by varying the number of layers, the number of neurons in each layer, the training algorithm, the hyperparameters, and so forth. A simple and efficient approach is to use several random initializations of the weights. This option has proven to be efficient enough to generate an ensemble with partially independent members [38]. Given a mixture of *K* trained neural networks, each member outputs a solution with a mean $Y_{NN}^{(k)}$ and a variance $(\sigma_{D,ML}^{(k)})^2$. A single averaged normal distribution can then be defined with a mean $\bar{Y}(X) = Y_{NN}^{\text{ens}}$, where:

$$Y_{NN}^{\text{ens}} = \frac{1}{K} \sum_{k=1}^{K} Y_{NN}^{(k)} \tag{9}$$

and a variance given by:

$$
\sigma_{\text{ens}}^2 = \frac{1}{K} \sum_{k=1}^{K} \left\{ \left(\sigma_{D,ML}^{(k)}\right)^2 + \left(Y_{NN}^{(k)}\right)^2 \right\} - \left(Y_{NN}^{\text{ens}}\right)^2. \tag{10}
$$

*K* is typically taken between 5 and 12 (in the following numerical results, it is set to 10). Therefore, the numerical prediction of the network is represented by a Gaussian with the mean $\bar{Y}(X)$ and the variance $\sigma_{\text{ens}}^2$, which represents uncertainties in both the data and the weights.
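Equations (9) and (10) amount to the mean and variance of an equally weighted mixture of *K* Gaussians. A minimal sketch for a single output component is given below; the member means and variances are hypothetical numbers chosen only to exercise the formulas.

```python
def ensemble_moments(means, variances):
    """Combine K member predictions into one Gaussian (Equations (9) and (10))."""
    k = len(means)
    mean_ens = sum(means) / k                                  # Equation (9)
    # Average of the members' second moments: variance + mean^2
    second_moment = sum(v + mu ** 2 for mu, v in zip(means, variances)) / k
    var_ens = second_moment - mean_ens ** 2                    # Equation (10)
    return mean_ens, var_ens

means = [0.10, 0.12, 0.08, 0.11]       # hypothetical member means
variances = [1e-4, 1e-4, 2e-4, 1e-4]   # hypothetical member noise variances
mean_ens, var_ens = ensemble_moments(means, variances)
```

The ensemble variance is the average member variance plus the spread of the member means, which is why it accounts for uncertainty in the weights as well as noise in the data.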

#### *2.2. Global Sensitivity Analysis*

Sensitivity analysis provides a means of determining the effects of variations of input parameters on the outputs of a model. If a small change in an input parameter results in a relatively significant difference in the output, then the parameter is considered significant for the model. In a global sensitivity analysis, all the inputs are varied simultaneously over their ranges, and are usually considered independent. The fundamental steps of the global sensitivity analysis technique are: (i) specification of the computational model; (ii) determination of relevant inputs and their bounds; (iii) input sample generation by a sampling design method; (iv) model evaluation using the generated input parameters; and (v) uncertainty analysis and calculation of the relative importance of each input through a sensitivity estimator. For more mathematical details, see [39] and references therein. The code described in [39] is also used for the present case study.

#### **3. Case Study: Application to Romaine-2 Dam**

A real rockfill dam was selected as a case study to illustrate the application of the surrogate modeling methodology to a global sensitivity analysis and an uncertainty analysis. Figure 3a illustrates a 2D cross-section of the Romaine-2 dam built in Quebec (Canada) [40,41]. The dam is 112 m high, has an asphalt core and is grouted on a rock foundation. The asphalt core is surrounded by crushed stones having a maximum size of 80 mm, which act as supports. The transition zone (*N*) lies next to the support region (*M*), composed of crushed stones having a maximum size of 200 mm. Moreover, particles with a maximum size of 600 mm are used in the inner shell zone (*O*), while the outer region (*P*) is composed of rocks with a maximum size of 1200 mm. Two vertical inclinometers named INV1 and INV2 are installed at two different positions (see Figure 3a) to measure the vertical displacements, which constitute the measured data in this study. Using the plane strain hypothesis, a finite element model of the dam structure was built using the commercial code Plaxis [42]. A mesh of 2187 triangular elements with 15 nodes each is presented in Figure 3b, where the different soil sub-domains are meshed accordingly and more refinement is used around the asphalt core. A mesh convergence study [43] showed that the mesh is fine enough. To simplify the study, the Mohr–Coulomb (MC) constitutive law was used, given that the dam was heavily compacted during construction [40]. Indeed, a detailed numerical study [43] showed that the discrepancies between the MC results and those obtained with the more sophisticated Hardening Soil model [21] for this rockfill dam are not significant. A dataset *D* was built using Sobol's sampling algorithm to generate *N* sets of *n* = 5 physical parameters related to the subdomain (*P*). The parameters include the cohesion (*C*), specific weight (*ρ*), shear modulus ($G_{ref}$), Poisson coefficient (*ν*) and the friction angle (*φ*). For a sample (*i*), the input vector is then $x_D^{(i)} = (C^{(i)}, \rho^{(i)}, G_{ref}^{(i)}, \nu^{(i)}, \varphi^{(i)})^T$. The parameters are assumed to follow a uniform distribution. Other types of distributions could be used to generate the sample set of soil parameters if more data were available. The dilatancy angle is set relative to the friction angle as *ψ* = *φ* − 30 (in degrees). Only the parameter variations in zone (*P*) are considered in this study, as this domain covers the largest portion of the dam. Ideally, all sub-domain parameters would be included, but for the sake of illustration only zone (*P*) is considered, as it is the most significant. The displacement fields corresponding to the *N* sets of inputs $x_D^{(i)}$ are obtained by running Plaxis [42]. The displacements at a number of points (32 in this case) on each inclinometer are extracted, yielding a response vector $Y_D^{(i)}$ of dimension *m* = 64.
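The generation of the input sets can be sketched as follows. Note that this example swaps in a Halton sequence, a different low-discrepancy sequence that is easy to write in plain Python, as a stand-in for the Sobol sequence used in the study. The parameter intervals are placeholders in arbitrary units, not the values of Table 1, and the function names are illustrative.

```python
def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer in a given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result

def halton_samples(n_samples, bounds):
    """Low-discrepancy points scaled to [low, high] for each parameter."""
    primes = [2, 3, 5, 7, 11]  # one prime base per input dimension (n = 5)
    names = list(bounds)
    samples = []
    for i in range(1, n_samples + 1):
        point = {}
        for name, base in zip(names, primes):
            low, high = bounds[name]
            point[name] = low + (high - low) * radical_inverse(i, base)
        # Dilatancy angle tied to the friction angle: psi = phi - 30 (degrees)
        point["psi"] = point["phi"] - 30.0
        samples.append(point)
    return samples

# Placeholder intervals only (hypothetical); the actual ones are given in Table 1.
bounds = {
    "C": (0.0, 1.0),
    "rho": (20.0, 24.0),
    "G_ref": (50.0, 150.0),
    "nu": (0.2, 0.35),
    "phi": (40.0, 50.0),
}
samples = halton_samples(12, bounds)
```

Each dictionary in `samples` would correspond to one finite element run, whose extracted displacements form one response vector.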

Table 1 presents the parameter interval of variations of zone *P* and the parameter values of zones *N*, *O* and *M*. The parameter estimates in Table 1 are based on a previous study conducted in [40,43].


**Table 1.** Soil parameter values or intervals of variations for zones P, N, O and M.

(**a**) Cross-section of the Romaine-2 dam in Plaxis. The different zones of the dam are labelled with letters. The vertical inclinometers are denoted by INV1 and INV2.

(**b**) Mesh used for the computational domain.

#### **Figure 3.** Romaine-2 dam.

#### *3.1. Sample Size Convergence Study*

The Sobol sampling technique [44] was used to generate the samples by varying their size *N* (*N* = 12, 48, 96, 156, 204, 252, 300, 348, 392, 444, 496, 512, 600, 720, 840, 900, 1080, 1500 and 3000). The corresponding numerical simulations were performed using Plaxis, which required 587 CPU hours on an Intel-i7 PC, for *N* = 3000. To build confidence in the generated database, a convergence study with respect to *N* was performed for the standard deviation of the vertical displacement at the 64 measurement points on the inclinometers. To check the convergence for this statistical study, standard deviation plots were built for the sample size at three positions on each inclinometer: at the top, middle and bottom (see Figure 4). The standard deviations show some fluctuations as the sample size is increased up to 1080; however, between sample sizes 1080 and 3000, the standard deviation is close to constant (up to 1% of variation), which implies that sample size 1080 is sufficient for subsequent sensitivity studies.
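The convergence check described above can be sketched as follows: the standard deviation is computed for increasing sample sizes and compared with the value at the largest size. The synthetic Gaussian data, the nested use of the first *N* values, and the function names are assumptions made for this illustration; in the study the responses come from Plaxis runs.

```python
import random
import statistics

def std_convergence(values, sample_sizes):
    """Standard deviation of the first N values for each N in sample_sizes."""
    return {n: statistics.stdev(values[:n]) for n in sample_sizes}

def first_converged(stds, reference_n, tol=0.01):
    """Smallest N such that the std for N and for every larger N stays
    within a relative tolerance tol of the value at reference_n."""
    ref = stds[reference_n]
    sizes = sorted(stds)
    for idx, n in enumerate(sizes):
        if all(abs(stds[k] - ref) <= tol * abs(ref) for k in sizes[idx:]):
            return n
    return reference_n

# Synthetic stand-in for the displacement at one measurement point.
rng = random.Random(0)
values = [rng.gauss(0.0, 0.04) for _ in range(3000)]
sizes = [12, 48, 96, 300, 1080, 3000]
stds = std_convergence(values, sizes)
converged_n = first_converged(stds, reference_n=3000)
```

With the 1% tolerance quoted in the text, `converged_n` plays the role of the sample size retained for the subsequent analyses.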

The confidence intervals for the displacements (mean ± 2 standard deviations) obtained with this classical statistical analysis (which is in fact a Monte Carlo simulation (MCS)) are shown in Figure 5. The measured data for each inclinometer are also represented in this figure, revealing fluctuations that can be attributed to external effects such as the installation process, calibrations, temperature variations and human factors, which may have influenced some probes in the inclinometers. At the bottom, where the displacements should be zero, there is instead a 2.5 cm displacement. Therefore, the uncertainty in the measured displacement is estimated to be at least ±2.5 cm. Figure 5 shows that, considering the uncertainties, the measured data are mostly within the predicted numerical confidence intervals, especially when the displacements are more significant. The statistical confidence intervals could be enlarged by changing the distribution intervals of the input parameters. Indeed, we used a priori uniform distributions on estimated input intervals [40].
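The comparison between a measurement and the predicted confidence interval can be expressed as a simple overlap test. The sketch below assumes displacements in metres, the ±2.5 cm measurement uncertainty mentioned above, and an interval of two standard deviations around the mean; the function name and the numbers in the usage lines are hypothetical.

```python
def within_interval(measured, mean, std, meas_unc=0.025, n_std=2.0):
    """True if [measured - meas_unc, measured + meas_unc] overlaps
    the predicted interval [mean - n_std * std, mean + n_std * std]."""
    low, high = mean - n_std * std, mean + n_std * std
    return measured + meas_unc >= low and measured - meas_unc <= high

# Hypothetical values: predicted mean 0.20 m with a standard deviation of 0.04 m.
close_reading = within_interval(0.30, 0.20, 0.04)
far_reading = within_interval(0.40, 0.20, 0.04)
```

Applying such a test at every measurement point gives the fraction of readings that are consistent with the numerical predictions.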

(**b**) Middle section of the dam. (**c**) Bottom section of the dam.

**Figure 4.** Variations of standard deviation (of the vertical displacement) with respect to the sample size for each inclinometer. The plots are built for the nodes close to the top, middle and bottom sections of the dam.

#### *3.2. Sobol Indices*

A Sobol index is defined as the ratio of a partial variance to the total variance, and reflects the relative importance of each input parameter [45]; the indices therefore range from 0 to 1. Figure 6 shows them for points located at the top, middle and bottom of the inclinometers. It is evident from Figure 6 that the shear modulus is the dominant parameter, with a contribution of 44% to 71% in the top sections of the dam, and that its influence diminishes gradually with depth. The Poisson coefficient is the second most significant parameter, with a smaller effect (24%) at the top and a high impact (84%) close to the foundation. The foundation of the dam (made up of grouted rock) lies at 140 m; therefore, the influence of the soil parameters changes abruptly at the bottom.

The first-order indices are calculated along the inclinometers, as shown in Figure 7. As stated earlier, for both inclinometers, the shear modulus is dominant in the upper section of the dam. The Poisson coefficient is another crucial parameter influencing the dam's behavior. While it is less influential at the top section, its impact increases towards the bottom part. The specific weight only affects the lower section; thus, the shear modulus and the Poisson coefficient are the most significant parameters, although their contributions vary with the elevation.
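For readers unfamiliar with variance-based indices, the sketch below estimates first-order Sobol indices with a standard pick-freeze Monte Carlo estimator (two independent sample matrices, with one column swapped at a time). It is not necessarily the estimator implemented in the code of [39]; the toy linear model and all names are assumptions, and plain pseudo-random sampling on the unit interval is used for brevity.

```python
import random

def first_order_sobol(model, n_inputs, n_samples, seed=0):
    """Pick-freeze estimate of first-order Sobol indices for inputs ~ U(0, 1)."""
    rng = random.Random(seed)
    A = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    B = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    fA = [model(x) for x in A]
    fB = [model(x) for x in B]
    all_y = fA + fB
    mean = sum(all_y) / len(all_y)
    var = sum((y - mean) ** 2 for y in all_y) / (len(all_y) - 1)
    indices = []
    for i in range(n_inputs):
        # Rows of A with column i replaced by the corresponding entry of B
        fABi = [model(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        # Partial variance due to input i (outputs centred to reduce noise)
        partial = sum((yb - mean) * (yab - ya)
                      for yb, yab, ya in zip(fB, fABi, fA)) / n_samples
        indices.append(partial / var)
    return indices

def toy_model(x):
    """Hypothetical response: strong effect of x[0], weak effect of x[1], none of x[2]."""
    return 4.0 * x[0] + x[1]

indices = first_order_sobol(toy_model, n_inputs=3, n_samples=20000)
```

For this toy model the exact indices are 16/17, 1/17 and 0, so the estimates give a feel for the sampling error at a given sample size.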

**Figure 7.** First Sobol variations with respect to the elevation.

#### *3.3. Surrogate Modeling*

Surrogate modeling is an approach aimed at generating an approximate numerical model to reduce the computing time, especially when a large number of simulations are required, as is the case in uncertainty and sensitivity analysis. Instead of using the original 'full-order' finite element model, an approximate one called a 'surrogate model' (or response surface) is built using the input–output database. Many techniques could be used, but here we consider polynomial chaos expansions and deep neural networks. Based on the convergence study in Section 3.1, the *N* = 1080 dataset is large enough to build the surrogate models. To assess the accuracy of these models, we examine the residual errors through the root mean square error (RMSE) and the coefficient of determination ($R^2$).
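The two accuracy measures mentioned above can be computed as in the following sketch, which uses their usual textbook definitions; the function names and the small data vectors are illustrative.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between reference and surrogate outputs."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical full-order outputs and surrogate predictions.
y_full = [1.0, 2.0, 3.0, 4.0]
y_surrogate = [1.0, 2.0, 3.0, 5.0]
error = rmse(y_full, y_surrogate)
score = r_squared(y_full, y_surrogate)
```

An RMSE close to zero and an $R^2$ close to one indicate that the surrogate reproduces the full-order model well on the points considered.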

#### 3.3.1. Polynomial Chaos Expansion (PCE)

A polynomial chaos expansion-based method [46] is a probabilistic technique that can be used to build an accurate surrogate model. The degree of the polynomials and the regularization parameter are tuned to get the best results. The PCE degree is varied from 2 to 6, and the regularization parameter $\lambda_P$ takes the values 0.001, 0.01 and 0.1.

The mean and standard deviation are calculated using the surrogate model obtained by running a simple Monte Carlo method on the PCE. The evaluation of the absolute mean error with respect to the polynomial order and the regularization parameter for an output response is shown in Figure 8, and is defined as:

$$E_1 = \frac{1}{m} \sum_{i=1}^{m} \left\| Y_{mp}^{i} - Y_{ms}^{i} \right\|,\tag{11}$$

where *m* is the number of nodes and $Y_{mp}$ denotes the mean of the predicted displacement at the same node as $Y_{ms}$, the simulated displacement. Ideally, $\lambda_P$ is selected as the smallest value that avoids overfitting. Figure 8 shows that, for $\lambda_P$ = 0.001, 0.01 and 0.1, the value of $E_1$ decreases with the polynomial degree for both inclinometers. Therefore, the results for degree *P* = 6 and $\lambda_P$ = 0.001 are considered the most reliable.
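The tuning of the degree and of the regularization parameter can be illustrated on a one-dimensional toy problem. The sketch below fits a Legendre expansion by ridge-regularized least squares and scans the same grid of degrees and $\lambda_P$ values as in the text, using a mean absolute error on held-out points in the spirit of Equation (11). The target function, the sample points and all function names are assumptions; the study uses five inputs and a dedicated PCE code.

```python
import math

def legendre_basis(xi, degree):
    """Legendre polynomials P_0..P_degree at xi in [-1, 1] (three-term recurrence)."""
    values = [1.0, xi]
    for k in range(1, degree):
        values.append(((2 * k + 1) * xi * values[k] - k * values[k - 1]) / (k + 1))
    return values[: degree + 1]

def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x

def fit_pce(xs, ys, degree, lam):
    """Ridge-regularized least-squares fit of the expansion coefficients."""
    phi = [legendre_basis(x, degree) for x in xs]
    p = degree + 1
    gram = [[sum(row[a] * row[b] for row in phi) + (lam if a == b else 0.0)
             for b in range(p)] for a in range(p)]
    rhs = [sum(row[a] * y for row, y in zip(phi, ys)) for a in range(p)]
    return solve(gram, rhs)

def predict(coeffs, x):
    """Evaluate the fitted expansion at x."""
    return sum(c * b for c, b in zip(coeffs, legendre_basis(x, len(coeffs) - 1)))

def mean_abs_error(coeffs, xs, ys):
    """Mean absolute difference between surrogate and reference outputs."""
    return sum(abs(predict(coeffs, x) - y) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical smooth target on [-1, 1]; training and held-out points differ.
train_x = [-1.0 + 2.0 * i / 39 for i in range(40)]
test_x = [-0.975 + 1.95 * i / 19 for i in range(20)]
errors = {}
for degree in range(2, 7):
    for lam in (0.001, 0.01, 0.1):
        coeffs = fit_pce(train_x, [math.exp(x) for x in train_x], degree, lam)
        errors[(degree, lam)] = mean_abs_error(coeffs, test_x, [math.exp(x) for x in test_x])
best = min(errors, key=errors.get)
```

The pair stored in `best` is the (degree, $\lambda_P$) combination with the lowest held-out error on this toy problem, mirroring the selection made from Figure 8.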

Figure 9 shows that the measured and predicted displacements obtained using PCE trained on datasets *N* = 300 and *N* = 1080 are in good agreement. Moreover, when considering the measurements along with their uncertainties, we see that they are mostly within the predicted numerical confidence intervals of PCE, especially when the displacements are more significant. The first-order indices along the inclinometers obtained using PCE are shown in Figure 11. As stated earlier, for both inclinometers, the shear modulus is dominant in the upper section of the dam. The Poisson coefficient is another crucial parameter influencing the dam's behavior. While it is less influential at the top section, its impact increases towards the bottom part. The specific weight only affects the lower section. Thus, the shear modulus and the Poisson coefficient are the most significant parameters, although their contributions vary with the elevation. The Sobol indices at the top, middle and bottom of the dam are also recomputed based on the PCE surrogate model, as shown in Figures 10 and 11, which convey almost the same information and conclusions as Figures 6 and 7.

**Figure 8.** Absolute mean error for degree and regularization parameters.

**Figure 10.** The pie charts show the sensitivity indices based on PCE for INV1 and INV2 vertical displacements, respectively.

The shear modulus is the dominant parameter, with a contribution of 50% to 70% in the top sections of the dam, and whose influence diminishes gradually with the depth. The Poisson coefficient is the second most significant parameter, with a smaller effect (18%) on top and a high impact (90%) close to the foundation.

#### 3.3.2. Deep Neural Network Results

In order to fit the data, the MATLAB function 'Neural Net Fitting' is used with a five-layer feedforward network, as shown in Figure 2. A scaled conjugate gradient algorithm was used for the training. The datasets (*N* = 1080 and *N* = 300) were divided into training, validation and testing subsets in the proportions 70%, 15% and 15%, respectively. An ensemble of ten trained networks was created by randomly initializing the weights in the training, and the outputs were predicted individually and averaged to obtain an ensemble output solution. An example of plots for datasets *N* = 300 and *N* = 1080, showing the fitness variation with respect to the training iterations (epochs), is presented in Figure 12.
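The 70%, 15% and 15% division of a dataset can be sketched as a random partition of the sample indices. This is an illustration in plain Python, not the routine used internally by the MATLAB toolbox; the function name and the seed are assumptions.

```python
import random

def split_indices(n_samples, fractions=(0.70, 0.15, 0.15), seed=0):
    """Shuffle sample indices and cut them into train/validation/test subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(fractions[0] * n_samples)
    n_val = round(fractions[1] * n_samples)
    # The remaining indices form the test subset
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])

train, val, test = split_indices(1080)
```

For *N* = 1080 this yields 756 training, 162 validation and 162 test samples. Repeating the training with different random weight initializations on such a split is one way to obtain the members of an ensemble.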

The mean and standard deviation are calculated using the surrogate model obtained by running a simple Monte Carlo method on the ensemble neural network model. The mean and variance for the ensemble model are computed by Equations (9) and (10).

The displacements obtained with the ensemble neural network are shown in Figure 13, and are very similar to those obtained with the statistical approach in Figure 5. The corresponding sensitivity indices are represented as pie charts in Figure 14 and as first-order indices along the inclinometers in Figure 15. The displacement standard deviations calculated on the inclinometers using the statistical approach (MCS), the PCE surrogate model and the DNN model are reported in Table 2, with a maximum standard deviation for all methods close to 4 cm. Moreover, near the foundation of the dam the displacements are almost zero.

**Figure 11.** First Sobol index results obtained using PCE surrogate model.

(**a**) Convergence of the fitness function for *N* = 1080. (**b**) Convergence of the fitness function for *N* = 300.

**Figure 12.** Performance of NN.


**Table 2.** Comparative study of the standard deviation (in m) obtained by numerical simulations and surrogate models for the top, middle and bottom sections of the dam.

(**a**) Vertical displacements for inclinometer INV1. (**b**) Vertical displacements for inclinometer INV2.

**Figure 13.** Confidence intervals obtained using an ensemble of neural networks.

**Figure 14.** The pie charts show the sensitivity indices based on DNN for INV1 and INV2 displacements, respectively.

**Figure 15.** First Sobol index results obtained using ensemble NN.

Figure 16 shows the computational efficiency, in terms of CPU time, of one Plaxis realization and of the surrogate models with respect to the number of samples of the soil parameters. It can be observed that the surrogate models predict the results more efficiently than the FEM model. This result will be helpful for an upcoming study on the identification of soil parameters by inverse analysis. In the inverse analysis, the optimization algorithm makes hundreds of calls to obtain the numerical solutions [47]. Therefore, the surrogate models will be used instead of the original full-order finite element model for computational efficiency. The outcome of this study is that, indeed, the NN requires far fewer samples to carry out a sensitivity or identification analysis compared to the full-order model.

**Figure 16.** Computational efficiency for the displacements obtained by FEM, PCE and NN.

#### **4. Conclusions**

This paper contributes to the sensitivity and uncertainty analysis of rockfill dams using the surrogate modeling approach. The approach was applied to a real rockfill dam with an asphalt core. Two surrogate models were developed, namely a polynomial chaos expansion (PCE) model and a deep neural network (DNN), each trained on two datasets (*N* = 300 and *N* = 1080). Their results were compared to those obtained with Monte Carlo simulations. The variance-based sensitivity analysis reinforces the fact that the shear modulus and the Poisson coefficient are the parameters that play the most significant role in the dam's behavior. Therefore, when considering all material sub-domains, these two parameters may be kept as the only significant uncertain parameters, thereby significantly reducing the total number of uncertain inputs. A second analysis was conducted by sampling the input parameters using a uniform probability distribution. Overall, this study shows that building surrogate models reduces the computational cost of numerical models when a large number of simulations is required, as in sensitivity and uncertainty analyses.

**Author Contributions:** Data curation, G.S. and A.S.; Formal analysis, G.S. and A.S.; Funding acquisition, A.S.; Investigation, G.S.; Methodology, G.S. and A.S.; Project administration, A.S.; Resources, A.S.; Software, G.S.; Supervision, A.S.; Validation, G.S.; Visualization, G.S.; Writing – original draft, G.S.; Writing – review & editing, G.S. and A.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by the Natural Sciences and Engineering Research Council of Canada and Hydro Québec, Canada. Their financial support is gratefully acknowledged.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author.

**Acknowledgments:** Not applicable.

**Conflicts of Interest:** The authors declare that they have no conflict of interest.

#### **References**


**Fernando Salazar \* , André Conde , Joaquín Irazábal and David J. Vicente**

International Centre for Numerical Methods in Engineering (CIMNE), Universitat Politècnica de Catalunya, 08034 Barcelona, Spain; aconde@cimne.upc.edu (A.C.); jirazabal@cimne.upc.edu (J.I.); djvicente@cimne.upc.edu (D.J.V.)

**\*** Correspondence: fsalazar@cimne.upc.edu

**Abstract:** Dam safety assessment is typically made by comparison between the outcome of some predictive model and measured monitoring data. This is done separately for each response variable, and the results are later interpreted before decision making. In this work, three approaches based on machine learning classifiers are evaluated for the joint analysis of a set of monitoring variables: multiclass, two-class and one-class classification. Support vector machines are applied to all prediction tasks, and random forest is also used for multi-class and two-class. The results show high accuracy for multi-class classification, although the approach has limitations for practical use. The performance in two-class classification is strongly dependent on the features of the anomalies to detect and their similarity to those used for model fitting. The one-class classification model based on support vector machines showed high prediction accuracy, while avoiding the need for correctly selecting and modelling the potential anomalies. A criterion for anomaly detection based on model predictions is defined, which results in a decrease in the misclassification rate. The possibilities and limitations of all three approaches for practical use are discussed.

**Keywords:** anomaly detection; machine learning; support vector machines; random forest; one-class classification

#### **1. Introduction**

Dams are an essential element in our way of living, since they provide fundamental services to our society, including drinking water, irrigation, navigation, flood protection, and recreation. In addition, they are a decisive element in hydroelectric generation schemes. According to the International Commission on Large Dams (ICOLD), there are around 60,000 large dams in operation worldwide, 6100 of which are in Europe [1]. Many of them were built decades ago and are close to, or have even exceeded, their service life. This results in an increasing relevance of predictive maintenance and safety assessment of dams, as was highlighted in a recent report published by the United Nations University [2]. Similar figures were also reported in the USA [3].

Dam failures are rare, but safe dam operation requires significant resources for monitoring and repair. In this context, the early detection of anomalies allows increasing the effectiveness of investments in maintenance and, therefore, reduces the cost of operation.

The conventional approach to anomaly detection involves the use of some predictive model to estimate the dam response under a given combination of loads. Models based on the finite element method (FEM) can be used for such a purpose, once properly calibrated. Nonetheless, there is a tendency towards the use of machine-learning (ML) models, which are solely based on monitoring data [4,5].

In both cases, a set of monitoring devices is typically selected, and the measurements are compared to the predictions of the model. This is done separately for each response variable, then results are interpreted together with the knowledge about the dam properties, past behaviour, and other relevant information. In case some deviation is detected between the expected response and the observed behaviour, engineering judgement is employed to make decisions regarding dam safety. In particular, the comparison shall be interpreted to identify the probable origin of the observed deviations, which requires an additional effort.

**Citation:** Salazar, F.; Conde, A.; Irazábal, J.; Vicente, D.J. Anomaly Detection in Dam Behaviour with Machine Learning Classification Models. *Water* **2021**, *13*, 2387. https://doi.org/10.3390/w13172387

Academic Editor: Anargiros I. Delis

Received: 22 June 2021 Accepted: 25 August 2021 Published: 30 August 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In this work, ML is applied to jointly analyse the records of a set of relevant monitoring devices and to associate them either to normal operation or to some anomaly scenario. This approach has two potential benefits:


In spite of the increasing interest of the community in applying ML methods in dam safety, the joint analysis has been much less explored. Mata et al. [6] applied linear discriminant analysis (LDA) to classify a group of observations into two classes: normal operation and potential failure scenario. They used DEM/FEM to generate the data corresponding to both situations. In a previous work, we used random forests (RF) as classifiers to associate a set of records to six potential scenarios (normal and five different potential anomalies) [7]. Although the results showed the potential of such an approach, a relevant drawback was also highlighted: anomaly scenarios need to be simulated with accuracy to generate the training set. This raises doubts about the capability for anomaly detection when the actual behaviour is not considered among the simulated scenarios. This relevant issue, especially from a practical viewpoint, is addressed in this work: a methodology is proposed for detecting unforeseen anomalies, i.e., scenarios which were not used for training the ML classifier. A similar approach was applied by Fischer et al. for detecting internal erosion in earth dams and levees, based on experimental laboratory data [8,9]. We further explore the possibilities of such an approach for anomaly detection in arch dams, with the addition of the following elements:


The rest of the paper is organised as follows: the methods used are introduced in Section 2, including the FEM model used for generating the database, the ML algorithms and their calibration; results are presented and described in Section 3: model calibration, performance analysis, exploration of errors and evaluation on the validation set. Section 4 includes the conclusions and ideas for future research.

#### **2. Methods**

The overall workflow includes the following steps:


5. In view of the results for the test set, a new criterion for anomaly detection was defined, based on the model predictions, which was applied to the validation set.

The details of each step are described in the next subsections.

#### *2.1. Case Study*

The proposed methodology was applied to a Spanish double curvature arch dam with a height of 81 m above foundation and 20 cantilevers, with the material properties specified in Table 1. Five years of monitoring data were considered for this work (corresponding to the period from March 1999 to March 2004), which included the reservoir level and the air temperature, as well as the displacements at 28 monitoring stations corresponding to seven pendulums located as shown in Figure 1.

**Table 1.** Material properties.

**Figure 1.** Location of pendulums and cracks considered for each scenario. View from downstream.

#### *2.2. FEM Model*

For the construction of the 3D model, the designed mesh was formed by linear tetrahedra of variable size (Figure 2). A portion of the foundation was included in the 3D model with the conventional dimensions for structural analyses: foundation domain of two heights of the dam in depth, upstream and downstream directions and more than half the length of the dam on the left and right sides (Figure 3). The geometry was generated using a tool developed by the authors [10], which assists in creating the 3D model of arch dams from the geometrical definition of the arches and cantilevers. The mesh size in the dam body was chosen to ensure at least three elements along the radial direction, while the size of the elements of the foundation was increased gradually up to 25 m. This resulted in a mesh of 33,000 nodes forming 173,000 tetrahedra, generated with the GiD software [11].

**Figure 2.** Close view of the dam body and the mesh elements.

**Figure 3.** Overall view of the computational model for the dam body and foundation.

The final goal of this study is to identify behaviour patterns associated with certain structural anomalies in arch dams and, in particular, those due to crack openings. After a literature review, four categories of cracks frequently observed in arch dams were identified (Table 2). Two anomaly scenarios were defined for each category (Figure 1).



The cracks are considered in the FEM model by duplicating the faces of the corresponding elements and eliminating the tensile strength. This is basically equivalent to using no-tension interface elements. The location and dimensions of the cracks introduced and the associated scenarios are shown in Figure 1.

Since the temperature field in the dam body influences the deformations of the dam and depends on the initial temperature considered, we performed a preliminary analysis to obtain a realistic thermal field to be used as the reference temperature in the body of the dam. This is a relevant issue, since thermal displacements are computed on the basis of the difference between these values and the thermal field at each time step of the simulation [16]. For this purpose, we performed a 12-year transient analysis with a fixed value of the initial temperature (8 °C) and a time step of 12 h. The resulting thermal field at the end of this preliminary calculation was taken as the initial temperature for all the scenarios considered. A similar approach was used by Santillan et al. [17] and by the authors in previous studies [18].

A transient analysis was performed for a 5-year period for Scenario 0 (normal operation, no crack opening). Since actual records for air temperature and reservoir level were applied, the results are realistic and can be considered representative of the actual behaviour of the dam. A one-way coupling between the thermal and the mechanical problem was applied: the thermal field at the end of the preliminary transient analysis was taken as the reference temperature, i.e., deviations from this value result in thermal deformations; the hydrostatic load is applied and the stress and deformation are computed assuming elastic behaviour; the deformation field is computed as the sum of the thermal and the mechanical deformations. The numerical implementation was developed by the authors and is described in detail in [16].

The results of this model in terms of radial and tangential displacements at the location of the monitoring stations (see Figure 1) were extracted and compared to the actual measurements recorded. Figure 4 shows this comparison for three of the measuring stations. Results show that the simulated behaviour is representative of the actual evolution of dam displacements as a function of the variation of the thermal and hydrostatic loads.

**Figure 4.** Comparison between the observed radial displacements in stations 12, 16 and 26 and the results of the numerical model of Scenario 0.

Afterwards, seven FEM simulations were run for the same 5-year period on the modified models corresponding to the anomaly scenarios defined. The tile plots in Figure 5 show the magnitude of the difference from Scenario 0: each tile corresponds to a monitoring variable and a particular scenario. The colour of the tile is a function of the median difference over the 5-year period between the records of the corresponding device for the scenario considered and those for Scenario 0, normalized with respect to the range of variation of the variable. Although this allows for comparison among devices and scenarios, the denormalized value (Figure 6) is also relevant, since deviations in variables with low fluctuation may be of the same order of magnitude as the measuring error, and thus hard to distinguish.
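The quantity shown in each tile can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name is hypothetical, the difference is assumed to be signed (scenario minus Scenario 0), and the range is taken from the Scenario 0 series, as stated in the caption of Figure 5.

```python
from statistics import median


def normalized_median_difference(scenario, reference):
    """Median of (scenario - reference), divided by the range of the reference.

    `scenario` and `reference` are equally long sequences of displacements for
    one device (an anomaly scenario and Scenario 0, respectively).
    Assumption: the difference is signed, not absolute.
    """
    if len(scenario) != len(reference):
        raise ValueError("both series must cover the same dates")
    value_range = max(reference) - min(reference)
    if value_range == 0:
        raise ValueError("reference series has no variation")
    differences = [s - r for s, r in zip(scenario, reference)]
    return median(differences) / value_range


# Hypothetical displacement values, for illustration only.
reference_series = [0.0, 2.0, 4.0, 6.0, 10.0]
scenario_series = [0.5, 2.5, 5.0, 7.0, 11.0]
tile_value = normalized_median_difference(scenario_series, reference_series)
```

Dropping the division by `value_range` gives the denormalized value of the kind plotted in Figure 6.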

The plots show that Scenario 2a features the greatest deviation from normal operation. This is due to the nature of the anomaly: a crack opening in the dam heel. The combined effect of hydrostatic load and low temperatures generates tensile stresses in that area, which result in high displacements when the crack opens. The deviation from the reference case is greatest for the lower station of the closest pendulum (Rad17), and decays progressively along that vertical line (Rad18 to Rad21). The effect is similar, though smaller, for the adjacent pendulum line (Rad12 to Rad16).

By contrast, the crack simulated in Scenario 2b, located in the downstream toe, has a minor effect on the records because such an area is compressed most of the time, thus the crack is closed and the behaviour is similar to the reference case.

The deviations in other scenarios are in general lower, with more impact on the tangential displacements in relative terms.

**Figure 5.** Median difference between anomaly scenarios and Scenario 0 for all tangential (**left**) and radial (**right**) displacements considered. Results are normalized to the range of variation of each input in Scenario 0.

**Figure 6.** Median difference between anomaly scenarios and Scenario 0 for all tangential (**left**) and radial (**right**) displacements considered. Colour scales differ between panels, reflecting the typically higher variation of radial displacements.

#### *2.3. Data Preparation*

As a result of the numerical calculations, a database is created including 8 scenarios: normal operation (Scenario 0) and 7 different anomalous behaviours (Scenarios 1a, 1b, 2a, 2b, 3a, 3b and 4). For each scenario, the database includes one record per day, corresponding to the actual recorded reservoir level and air temperature for the period 18 March 1999–15 March 2004, i.e., 1825 records per scenario.

This database reasonably approximates the dam response to the variation of thermal and mechanical loads in a realistic situation. However, the numerical model excludes the measuring errors which exist in actual devices. These errors were considered by adding noise with normal distribution *N*(0, 0.1) to the simulated displacements.
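A minimal sketch of this noise step is given below. The function name and the seed are hypothetical, and the second parameter of *N*(0, 0.1) is assumed here to be the standard deviation, expressed in the same units as the displacements; the text does not state whether it is a standard deviation or a variance.

```python
import random


def add_measurement_noise(values, sigma=0.1, seed=0):
    """Return a copy of `values` with zero-mean Gaussian noise added.

    Assumption: sigma is the standard deviation of the measuring error.
    A fixed seed keeps the synthetic database reproducible.
    """
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, sigma) for v in values]


# Hypothetical simulated displacements, for illustration only.
simulated = [12.4, 12.9, 13.1, 12.7]
noisy = add_measurement_noise(simulated)
```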

Such data are divided into three subsets as a function of the date: the training set includes data for the period 18 March 1999–17 March 2002, the test set ranges from 18 March 2002 to 17 March 2003 and the validation set goes from 18 March 2003 to 15 March 2004.
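The chronological split can be sketched as below, using the dates given above. The function names are illustrative; the only assumption is that each scenario has exactly one record per calendar day.

```python
from collections import Counter
from datetime import date, timedelta

START = date(1999, 3, 18)
TRAIN_END = date(2002, 3, 17)
TEST_END = date(2003, 3, 17)
END = date(2004, 3, 15)


def assign_subset(day):
    """Return 'train', 'test' or 'validation' for a record date."""
    if not START <= day <= END:
        raise ValueError("date outside the simulated period")
    if day <= TRAIN_END:
        return "train"
    if day <= TEST_END:
        return "test"
    return "validation"


def subset_sizes():
    """Count the daily records per subset for one scenario."""
    n_days = (END - START).days + 1
    return Counter(assign_subset(START + timedelta(days=i)) for i in range(n_days))


sizes = subset_sizes()
```

Under the one-record-per-day assumption, this gives 1096, 365 and 364 records per scenario for training, test and validation, which add up to the 1825 records per scenario mentioned above.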

#### *2.4. Classification Tasks*

#### 2.4.1. Multi-Class (MC) Classification

The conventional problem of supervised classification requires a training set with a set of inputs (also called features or predictors) and the corresponding labels. Those data are supplied to the algorithm, which learns the structure of the data and defines rules for assigning some classes to a set of inputs. In our case, the fitted model will be supplied with a set of monitoring records for a given load combination and will generate a prediction in terms of the scenario to which it corresponds. More precisely, the model differentiates between normal operation (Scenario 0) and each of the anomalies (other 7 scenarios).

In practice, ML classification models compute a probability of belonging to each of the classes defined during training for each set of input values. By default, the predicted class is that with the highest probability. However, the raw probabilities can be explored to draw more information regarding model predictions.

The prediction of this model corresponds to one of the 8 classes used for training. This approach has the advantage of distinguishing among different anomalies, but requires availability of samples corresponding to all possible situations, which need to be generated with numerical models. It is not clear if such a model would be useful in case some anomaly not included in the training set occurs.

#### 2.4.2. Two-Class (TC) Classification

To overcome this limitation, an alternative approach is proposed. Part of the anomalies considered were eliminated from the training set. As a result, models were fitted on a modified training set, which only includes Scenarios 0, 1a, 2a, 3a and 4. A new label was created with two classes: 0 for normal operation (former Scenario 0) and 1 for all other scenarios. To avoid the problem of imbalanced data [19], a random sample of records for anomalous scenarios was taken, so that this modified training set includes 1825 samples for class 0 and the same number of records for class 1 (equally distributed among the original Scenarios 1a, 2a, 3a and 4). The test set included both Scenario 0 and those anomalies not used for training (Scenarios 1b, 2b and 3b). Again, the class label was modified to include only two classes (0 and 1), as in the training set. This classification task is more challenging, since part of the test set corresponds to situations not used for training (Scenarios 1b, 2b and 3b). However, it is more realistic: anomalies in the test set may represent real scenarios, i.e., actual behaviour patterns not considered during model training.
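The relabelling and balanced sampling described above might look as follows. This is a sketch under assumptions: scenario names are stored as strings, the helper names are hypothetical, and any remainder of the equal split is assigned to the first scenarios in the list.

```python
import random

TRAIN_ANOMALIES = ("1a", "2a", "3a", "4")


def to_binary_label(scenario):
    """Class 0 for normal operation (Scenario 0), class 1 for any anomaly."""
    return 0 if scenario == "0" else 1


def balanced_anomaly_sample(records_by_scenario, n_normal, seed=0):
    """Draw n_normal anomalous records, split equally among the training anomalies.

    `records_by_scenario` maps a scenario name to its list of records.
    Sampling is without replacement within each scenario.
    """
    rng = random.Random(seed)
    per_scenario, remainder = divmod(n_normal, len(TRAIN_ANOMALIES))
    sample = []
    for position, scenario in enumerate(TRAIN_ANOMALIES):
        size = per_scenario + (1 if position < remainder else 0)
        chosen = rng.sample(records_by_scenario[scenario], size)
        sample.extend((scenario, record) for record in chosen)
    return sample


# Hypothetical records (plain integers), for illustration only.
toy_records = {name: list(range(10)) for name in TRAIN_ANOMALIES}
toy_sample = balanced_anomaly_sample(toy_records, n_normal=10)
```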

#### 2.4.3. One-Class (OC) Classification

The third alternative explored makes use of the 'One-Class Classification' approach [20,21]. This technique was developed for problems in which the information available for training only corresponds to the normal operation. It is therefore applied for novelty detection. The training set in this case is limited to the samples corresponding to Scenario 0 within the original training set. The model fitted with this procedure is only capable of predicting two classes: that used for training and some other (it therefore cannot differentiate among different types of anomalies). This method was developed for cases in which information on the response of the system for abnormal operation is not available or is costly or impossible to obtain. That is the case in dam safety, and that was the limitation of previous approaches: in the best setting, some anomalies could be simulated, but they do not necessarily correspond to the behaviour patterns that may occur.
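The idea of fitting on normal data only can be illustrated with a deliberately simple stand-in. The sketch below is not a one-class SVM and not the method used in this work: it is a z-score detector, with a hypothetical class name, whose parameter `nu` sets the approximate fraction of training samples left outside the accepted region.

```python
from statistics import mean, pstdev


class ZScoreNoveltyDetector:
    """Toy novelty detector fitted on normal-operation records only."""

    def __init__(self, nu=0.05):
        if not 0.0 < nu < 1.0:
            raise ValueError("nu must lie strictly between 0 and 1")
        self.nu = nu

    def fit(self, normal_rows):
        columns = list(zip(*normal_rows))
        self.means = [mean(column) for column in columns]
        # Guard against constant columns (zero standard deviation).
        self.stds = [pstdev(column) or 1.0 for column in columns]
        scores = sorted(self._score(row) for row in normal_rows)
        # Threshold at the (1 - nu) empirical quantile of the training scores.
        index = min(len(scores) - 1, int((1.0 - self.nu) * len(scores)))
        self.threshold = scores[index]
        return self

    def _score(self, row):
        # Largest absolute z-score over all monitoring variables.
        return max(abs(x - m) / s for x, m, s in zip(row, self.means, self.stds))

    def predict(self, row):
        """Return 0 for normal behaviour and 1 for a novelty (anomaly)."""
        return 1 if self._score(row) > self.threshold else 0


# Hypothetical two-variable records of normal operation, for illustration only.
normal_rows = [((i % 10) * 0.1, ((i * 7) % 10) * 0.1) for i in range(100)]
detector = ZScoreNoveltyDetector(nu=0.05).fit(normal_rows)
```

As with the one-class model described above, the detector only answers "normal" or "not normal"; it carries no information about the type of anomaly.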

#### *2.5. Algorithms*

Machine learning (ML) problems can be classified into two main categories according to the nature of the target variable: while in regression problems the goal is predicting the value of some numerical variable, in classification tasks the objective is assigning some label to a set of input values.

The vast majority of applications of statistical and ML methods to the analysis of dam monitoring data make use of the regression approach: some model is fitted to the available monitoring data with the aim of predicting some dam response such as the radial displacement at a given location within the dam body. Decisions regarding dam safety are made on the basis of the comparison between the model predictions and the observations.

By contrast, this work is based on classification: we define a set of response patterns, or classes, associated with the scenarios considered. They are provided to the model together with the values of the monitoring variables. The objective of model fitting is identifying patterns in the input data useful to distinguish between classes. The output of the model is thus a categorical variable (label).

Many ML algorithms can be applied both to regression and classification tasks, though their capabilities and performance often vary. In this work, two of the most popular ML algorithms available for classification were considered as described in the next sections: random forests (RF) and support vector machines (SVM).

#### 2.5.1. Random Forests (RF)

RFs [22] are known to be appropriate for environments with many highly interrelated input variables [23]. Although the number of samples in our database is relatively large compared to the number of inputs, the inputs are highly correlated by nature (they have a strong association since they are linked in the numerical model).

This same algorithm was previously used in regression problems in different applications, e.g., to build regression models to predict dam behaviour [24], to interpret the response of dams to seismic loads [25] and to better understand the behaviour of labyrinth spillways [26]. Other fields of application in the water sector include dam safety [27], water quality [28], classification of water bodies [29] or urban flood mapping [30].

A random forest model is a group of classification trees, each of which is fitted on an altered version (a bootstrap sample) of the training set [31]. Since they were first proposed by Breiman [22], RFs have been used in multiple fields both for regression and classification tasks. The main ingredients of the algorithm can be summarized as follows:


This process includes randomness in two steps (in bootstrap sample generation, and in taking predictors at each split) with the aim of capturing as many patterns as possible from the training data.

One of the advantages of RF is the existence of the out-of-bag data (OOB), i.e., the part of the observations excluded from each bootstrap sample. The prediction accuracy for each observation can be computed from the trees grown on samples where such observation was not included. This can be considered as an implicit cross validation, which allows for obtaining a good estimate of the prediction error without the need to explicitly separate a subset of the available data.
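The out-of-bag set of a single tree can be sketched as follows. The function names and the sample size are illustrative; only the bootstrap step is shown, not the tree fitting.

```python
import random


def bootstrap_sample(n_samples, seed=0):
    """Indices drawn with replacement: the 'in-bag' sample for one tree."""
    rng = random.Random(seed)
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def out_of_bag(n_samples, in_bag):
    """Indices never drawn in the bootstrap sample: the OOB set for that tree."""
    drawn = set(in_bag)
    return [i for i in range(n_samples) if i not in drawn]


in_bag = bootstrap_sample(1000, seed=1)
oob = out_of_bag(1000, in_bag)
oob_fraction = len(oob) / 1000
```

For large samples the expected OOB fraction approaches 1/e (about 37%), so each observation is left out of roughly one third of the trees and can be evaluated on them.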

Extensive application of this algorithm showed high prediction accuracy and robustness, i.e., the effect of the model parameters is low [31,32]. In addition, the algorithm performs implicit variable selection while fitting each tree, which simplifies pre-processing [33].

As mentioned above, RF classifiers are robust in the sense that the model parameters typically have low influence on the results. Nonetheless, a calibration process was followed in this work based on the OOB error: all possible combinations of *mtry* (4, 6, 8 and 10), *ntree* (400, 600, 800 and 1000) and *nodesize* (1, 3, 5 and 7) were considered to fit RF models, and the prediction accuracy for the OOB data was assessed. The combination of parameters with the lowest error was chosen to fit the final RF model. The same procedure was followed for multi-class and two-class tasks.
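The exhaustive search over the parameter grid can be sketched as below. The grid values are those listed above; `select_best` and `toy_error` are hypothetical names, and `toy_error` is an arbitrary placeholder standing in for the OOB error of a fitted model, not a result from this study.

```python
from itertools import product

MTRY = (4, 6, 8, 10)
NTREE = (400, 600, 800, 1000)
NODESIZE = (1, 3, 5, 7)

GRID = list(product(MTRY, NTREE, NODESIZE))


def select_best(oob_error):
    """Return the (mtry, ntree, nodesize) combination with the lowest error.

    `oob_error` is any callable that fits an RF model with the given
    parameters and returns its OOB error.
    """
    return min(GRID, key=lambda params: oob_error(*params))


def toy_error(mtry, ntree, nodesize):
    # Placeholder error surface, for illustration only.
    return abs(mtry - 8) + abs(ntree - 600) / 1000 + abs(nodesize - 3)


best_toy = select_best(toy_error)
```

The grid contains 4 × 4 × 4 = 64 combinations, each requiring one model fit.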

#### 2.5.2. Support Vector Machines (SVM)

Although SVM can be applied to regression problems, the algorithm was originally created for classification [34]. The model fitting process not only aims at increasing classification accuracy on the training set, but also at maximizing the margin to improve separation of the classes [35]. This results in greater generalization capability. In addition, SVM is also among the most appropriate algorithms for one-class classification [20] and has already been used for this purpose in the water field [21]. Applications of SVM both for regression and classification are numerous in different sectors. In hydraulics and hydrology, examples include pipe failure detection in water distribution networks [36], prediction of urban water demand [37], rainfall-runoff modelling [38], flood forecasting [39], as well as reliability analysis [40–42] and dam safety [4,5,8,9,43,44].

SVM make use of a non-linear transformation of the inputs into a high dimensional space, where a linear function is used for classification. The theoretical fundamentals of the algorithm are described in many publications (see, for instance [34,35,45]).

Since SVM models are more sensitive to the training parameters than RFs, calibration is more important. Five-fold cross-validation (CV) was applied to the training set to obtain reliable estimates of prediction error and thus to select the best training parameters. In this work, we used radial basis kernels, defined as a function of two parameters: *C* (cost) and *γ*. For MC and TC, all possible combinations of *C* (0.1, 1, 10) and *γ* (0.001, 0.01, 0.1) were considered and the best combination from CV was later applied to fit the final model.
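The role of *γ* and the cross-validation layout can be illustrated with the sketch below. It assumes the usual form of the radial basis kernel, exp(−*γ*‖x − y‖²), and an interleaved assignment of samples to folds; the text does not specify how the folds were built, and the function names are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma):
    """Radial basis kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def k_fold_splits(n_samples, k=5):
    """Return k (training_indices, validation_indices) pairs.

    Assumption: sample i goes to fold i mod k (interleaved folds).
    """
    splits = []
    for fold in range(k):
        validation = [i for i in range(n_samples) if i % k == fold]
        training = [i for i in range(n_samples) if i % k != fold]
        splits.append((training, validation))
    return splits


# Parameter grid for the MC and TC tasks, as listed in the text.
SVM_GRID = [(c, gamma) for c in (0.1, 1, 10) for gamma in (0.001, 0.01, 0.1)]

splits = k_fold_splits(10, k=5)
```

A larger *γ* makes the kernel decay faster with distance, so each training sample influences a smaller neighbourhood; each of the nine grid points is evaluated on the five folds.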

The process is similar for one-class classification (see Section 2.4), with the addition of the parameter *ν*, which controls the size of the margin between the class used for training and the outliers (anomalies in our case) [20]. We considered all possible combinations of *γ* (0.01, 0.04, 0.05, 0.06, 0.1), *C* (0.1, 1, 10) and *ν* (0.01, 0.025, 0.05, 0.075, 0.1). The results were evaluated in terms of the balanced accuracy (BA, defined in Section 2.6) on a test set including both the anomalous situations in the training period and all the cases for the test period.

#### *2.6. Measures of Accuracy*

Henceforth, anomalies are considered as positive experiments (correctly predicted cracked cases are thus true positives, TP), while Scenario 0 corresponds to negative experiments (correct predictions for Scenario 0 are true negatives, TN). Consequently, false positives (FP) will be cases where the model predicted a crack on data from a crack-free case, and false negatives (FN) those when the model predicted no crack with data from a cracked case. In this work, the following measures of accuracy were considered:

$$\text{Sensitivity} = \frac{\text{TP}}{\text{TP} + \text{FN}} \tag{1}$$

$$\text{Specificity} = \frac{\text{TN}}{\text{TN} + \text{FP}} \tag{2}$$

Two error measures were used that take into account both false positives and false negatives: balanced accuracy (BA) is computed as the mean of sensitivity and specificity. In turn, the F1 score [46] also considers both, but more relevance is given to the false positives. This is in accordance with the nature of the phenomenon to be considered: in dam safety, overlooking an anomaly is more important than predicting a false crack.

$$\text{F1} = \frac{\text{2} \times \text{Precision} \times \text{Sensitivity}}{\text{Precision} + \text{Sensitivity}} \tag{3}$$

where:

$$\text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}} \tag{4}$$
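Equations (1) to (4) and the balanced accuracy can be computed from the four counts of a confusion matrix as sketched below. The function name and the counts in the example are hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, precision, F1 and balanced accuracy.

    Anomalies are the positive class; Scenario 0 is the negative class.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "balanced_accuracy": balanced_accuracy,
    }


# Hypothetical counts, for illustration only.
example_metrics = classification_metrics(tp=80, fp=10, tn=90, fn=20)
```

As a quick check against the values reported later for the two-class RF model, a sensitivity of 0.697 and a specificity of 0.995 give a balanced accuracy of (0.697 + 0.995)/2 = 0.846.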

#### **3. Results and Discussion**

*3.1. Multi-Class Classification*

3.1.1. Calibration

Figure 7a shows the median of OOB class error for all combinations of parameters tested for RF models. It can be observed that the effect of the model parameters on the results is low. Nonetheless, we took the values from the best combination of those considered: *ntree* = 1000, *mtry* = 4, *nodesize* = 1. The result of the same calibration process for the SVM model is shown in Figure 7b. In this case, the best performance was obtained with *C* = 10 and *γ* = 0.001.

**Figure 7.** Results of the calibration process. (**a**) RF model. The mean of the OOB error for all classes is plotted as a function of the number of trees (*ntree*), *nodesize* and the *mtry* parameter. (**b**) SVM model results for five-fold cross-validation.

#### 3.1.2. Evaluation

Although OOB error is often a good estimate for the generalization error, the RF model was evaluated using the test set, so that it can be compared to the SVM model. The confusion matrix is the main result, showing the predictions versus the real values. Tables 3 and 4 include the results both for the RF and the SVM model, in addition to the F1 and balanced accuracy for each class.


**Table 3.** Confusion matrix for multi-class classification for the RF model.

**Table 4.** Confusion matrix for multi-class classification for the SVM model.


The results of both algorithms show high accuracy in identifying all scenarios, with the SVM model performing slightly better. This confirms the benefits of these techniques for supervised classification.

Both models show more accurate results than those obtained in a previous work based on RF [7], in which different anomalies were considered. This may be due to the calibration process, more detailed in this case, but also to the nature of the anomalies introduced. While they affected the mechanical boundary conditions in the former study, more realistic situations are considered here, representative of crack formation in different areas of the dam body. These modifications have a more local effect on the dam response, which is easier for ML models to identify.

The high accuracy demonstrates the soundness of the approach and the usefulness of the algorithms. However, it still has the limitation of the need for identifying and modelling the anomalies to be detected, which is highly relevant for its practical implementation.

#### *3.2. Two-Class Classification*

#### 3.2.1. Calibration

The same process was followed for calibration of both models for the case of two classes. The result is shown in Figure 8. As before, the combinations of parameters with the best performance for the OOB error (RF) and the five-fold cross-validation error (SVM) were later used for evaluation.

**Figure 8.** Result of the calibration process for the two-class classification task. (**a**) RF model. (**b**) SVM model.

#### 3.2.2. Evaluation

The evaluation of classification models for this task can be done in the first instance by means of the confusion matrix, as before. Table 5 shows the result for the RF model, which featured an F1 of 0.820 and a balanced accuracy of 0.846. In this case, there is a clear difference between classes. The model is highly accurate for identifying anomalies: the rate of false positives is 0.3%. This results in a specificity of 0.995. By contrast, the rate of false negatives is relatively high (48%), and thus sensitivity is lower (0.697).

**Table 5.** Confusion matrix for the RF model on the test set in the two-class classification problem.


The features of the training set need to be considered for the analysis of these results. The problem was posed in an unconventional manner, since samples labelled as anomalies in the training set (Class 1) are indeed different from those with the same label in the test set. They are both anomalous and different from Class 0, which corresponds to normal operation in both the training and the test sets, but they were computed from different numerical models. In conventional classification problems, classes defined in the training set are the same as in the evaluation or test sets. When the model is applied to a new set of input values, these are classified according to their similarity to each of the classes. In this case, the test scenarios are in fact different from either of the two classes defined during training. The model determines which of the two classes is more closely related to the new input. The relatively high proportion of anomalous cases that the model considers as normal is therefore explained by the nature of the classification task. This can be further explored by separating the samples for class 1 into the original scenarios (Table 6). There is a clear difference among anomalies: accuracies for Scenarios 1b, 2b and 3b are 52%, 67% and 90%, respectively.

**Table 6.** Detailed confusion matrix for the RF model and the two-class classification. All anomalous samples are separated into the original scenarios.


This conclusion is confirmed by the results of the SVM model (Table 7). Although the overall accuracy is again slightly higher than for the RF model (F1 0.822; balanced accuracy 0.847), the same imbalance is observed, with specificity of 0.997 and sensitivity of 0.698.

**Table 7.** Confusion matrix for the SVM model on the test set in the two-class classification problem.


The same difference among scenarios is observed for the SVM model (Table 8). While Scenario 3b is again well identified (98% accuracy), results are poorer for Scenarios 1b and 2b (57% and 54%, respectively).

**Table 8.** Detailed confusion matrix for the SVM model and the two-class classification. All anomalous samples are separated into the original scenarios.


#### *3.3. One-Class Classification*

#### 3.3.1. Calibration

Three different combinations of parameters featured the highest accuracy, one of which (*ν* = 0.075, C = 0.1, *γ* = 0.05) was taken to fit the final model. Figure 9 shows the results of the calibration process.

#### 3.3.2. Evaluation

The results of the one-class classifier on the test set show overall figures similar to those of the two-class models (F1 0.920; BA 0.903), but they are more balanced between the ability to detect normal operation and anomalies. The figures from the confusion matrix (Table 9) result in a sensitivity of 0.858 and a specificity of 0.948.

**Figure 9.** Result of the calibration process for the one-class SVM model with five-fold cross-validation.

**Table 9.** Confusion matrix for the SVM model on the test set in the one-class classification problem.


Again, these results can be further explored by separating the anomalies into the original scenarios (Table 10). In this case, all classes are predicted with higher accuracy (from 75% for Scenario 4 up to 100% for Scenarios 2a and 3a), at the cost of a higher proportion of false positives, which nonetheless is low (5%).

Results are better for Scenarios 2a and 3a because their deviation from the reference pattern (Scenario 0) is higher, as can be observed in Figure 5.

**Table 10.** Detailed confusion matrix for the one-class model separated by the original scenarios considered.


#### *3.4. Class Probability*

The previous analyses are based on the raw predictions of the ML models. In this section, we discuss the class probability. For example, RF models include a large number of classification trees, each of which generates a predicted class. The overall prediction is taken as the majority vote for all trees. The value of the predicted probability can be explored to draw more detailed information on the behaviour of the system and make decisions. The prediction of a class with high probability can be expected to be more reliable than others for which two or more classes feature similar probabilities.
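The relation between tree votes, class probability and predicted class can be sketched as follows. It is assumed here that the class probability of an RF model is the fraction of trees voting for each class; the function names and the vote counts are hypothetical.

```python
from collections import Counter


def vote_probabilities(tree_votes):
    """Fraction of trees voting for each class."""
    counts = Counter(tree_votes)
    total = len(tree_votes)
    return {label: count / total for label, count in counts.items()}


def majority_class(probabilities):
    """Class with the highest vote fraction (the default prediction)."""
    return max(probabilities, key=probabilities.get)


# Hypothetical votes of a 1000-tree forest for one set of records.
votes = ["0"] * 700 + ["2a"] * 250 + ["1a"] * 50
probabilities = vote_probabilities(votes)
prediction = majority_class(probabilities)
```

In this example the prediction is Scenario 0 with probability 0.7; a prediction backed by 0.95 of the votes would be considered more reliable.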

Following this idea, the predicted probabilities of the calibrated models for the test set were computed for all scenarios. Figure 10 includes the results for all 4 calibrated models with the classification of the outcome into TN, FN, TP and FP.

**Figure 10.** Predicted probability of belonging to Scenario 0. (**a**) RF multi-class. (**b**) SVM multi-class. (**c**) RF two-class. (**d**) SVM two-class.

This analysis was made with the aim of exploring the possibility of defining some practical criterion to improve the results of the raw predictions. This could be the case for the multi-class RF model: all wrong predictions, both FPs and FNs, correspond to relatively low probabilities for Scenario 0. In other words, predicted probabilities for TN are in general high, and those for TP are low in the vast majority of the cases. This may suggest that an intermediate category of uncertain predictions might be defined including all cases with predicted probability for Scenario 0 in an intermediate range (e.g., 0.2 to 0.4). This would eliminate the FPs and FNs, at the cost of converting a proportion of TPs and TNs into this intermediate category.
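The intermediate category suggested above could be implemented as in the following sketch. The band limits are the example values from the text (0.2 to 0.4); the function name and the label "uncertain" are hypothetical.

```python
def apply_uncertain_band(raw_prediction, p_scenario0, low=0.2, high=0.4):
    """Replace the raw prediction by 'uncertain' inside the probability band.

    `p_scenario0` is the predicted probability of belonging to Scenario 0.
    Outside the band, the raw prediction of the model is kept unchanged.
    """
    if low <= p_scenario0 <= high:
        return "uncertain"
    return raw_prediction


# Hypothetical predictions, for illustration only.
confident_normal = apply_uncertain_band("0", 0.85)
doubtful_case = apply_uncertain_band("0", 0.35)
confident_anomaly = apply_uncertain_band("2a", 0.05)
```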

The analysis of the plot for multi-class SVM shows the capability of the algorithm to maximise the margin between categories. Probabilities of Scenario 0 in correct predictions are close to 1 for TNs and close to 0 for TPs. The criterion mentioned for RFs is not useful to eliminate the FNs because the few errors feature probabilities above 0.5.

In any case, the main reason for not defining this practical criterion for multi-class models is that their default accuracy is already very high, in addition to the aforementioned limitation of the need to identify a priori and accurately model the anomalies to be detected.

As for the two-class models, the plots show that the separation between classes is less clear. Interestingly, the predicted probabilities of the SVM model for FNs are farther from the 0.5 limit than for the RF model. Again, there is not a clear benefit in using the predicted probabilities for practical purposes.

#### *3.5. Time Evolution of Predictions*

In previous sections, the model predictions were evaluated separately: both false positives and false negatives were assessed in terms of the number of occurrences as compared to the size of the test set. From a practical viewpoint, the persistence of predictions is relevant when it comes to making decisions regarding dam safety. Anomalies in dam behaviour generally occur progressively, starting with a small deviation from normal operation and increasing in time. In such an event, an accurate model would predict anomalous behaviour with persistence in time. In other words, no major decision will be made from a single prediction of anomaly if the subsequent sets of records are considered as normal by the model.

As a result, isolated prediction errors can be considered affordable from a practical point of view. Since the test set corresponds to a realistic evolution of external loads and dam response over time (one year of actual measurements), relevant conclusions can be drawn from the exploration of the location of errors in time.

This was done for all five models (three prediction tasks and two algorithms). More precisely, the number of consecutive errors (at least two, either false positives or false negatives) was computed and included in Table 11 together with the overall misclassifications. The results show a large reduction in misclassifications in all cases as the time window grows. It should be noted that for multi-class models, errors between anomalous scenarios are considered TPs.
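Counting persistent errors in a chronological series of predictions can be sketched as below. Two assumptions are made that the text does not settle: an error is any misclassification regardless of its type, and the figure reported is the number of errors belonging to runs of at least `min_run` consecutive errors. The function name is hypothetical.

```python
from itertools import groupby


def errors_in_runs(is_error, min_run=2):
    """Number of errors that belong to runs of at least `min_run` errors.

    `is_error` is a chronological sequence of booleans, one per daily
    prediction (True when the prediction is wrong).
    """
    total = 0
    for flag, run in groupby(is_error):
        length = sum(1 for _ in run)
        if flag and length >= min_run:
            total += length
    return total


# Hypothetical error sequence: one isolated error, then runs of two and three.
error_flags = [False, True, False, True, True, False, True, True, True]
persistent_two = errors_in_runs(error_flags, min_run=2)
persistent_three = errors_in_runs(error_flags, min_run=3)
```

In this toy sequence six predictions are wrong, five of them in runs of at least two and three in a run of at least three, which illustrates how the count drops as the time window grows.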

**Table 11.** Number of consecutive errors (both false negatives and false positives) by model and prediction task. All errors are also shown for comparison.


#### *3.6. Practical Criterion*

As a result of the previous analysis, a procedure was defined to generate predictions to be applied to the validation set. A homologous process was followed for all alternatives used (RF and SVM models for multi-class and two-class classification, and SVM model for one-class):

	- (a) If the model prediction is Normal and equal to previous prediction, i.e., at least two consecutive predictions of no-crack, it is classified as "Hard negative" (HN).
	- (b) If the model prediction is Normal, but the previous prediction was Anomaly, it is considered "Soft negative" (SN).
	- (c) If the model prediction is Anomaly, but the previous prediction was Normal, it is considered "Soft positive" (SP).
	- (d) If the model prediction is Anomaly and equals the previous prediction, it is termed as "Hard positive" (HP).
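Rules (a) to (d) can be sketched as follows. The function name is hypothetical, predictions are assumed to be already reduced to "Normal" or "Anomaly", and the first record of the series is left unlabelled because it has no previous prediction, a case the rules do not cover.

```python
def hard_soft_labels(predictions):
    """Label each prediction (from the second one on) as HN, SN, SP or HP.

    `predictions` is a chronological sequence of "Normal" / "Anomaly".
    """
    labels = []
    for previous, current in zip(predictions, predictions[1:]):
        if current == "Normal":
            # (a) repeated Normal is hard, (b) Normal after Anomaly is soft.
            labels.append("HN" if previous == "Normal" else "SN")
        else:
            # (d) repeated Anomaly is hard, (c) Anomaly after Normal is soft.
            labels.append("HP" if previous == "Anomaly" else "SP")
    return labels


# Hypothetical sequence of daily predictions, for illustration only.
daily = ["Normal", "Normal", "Anomaly", "Anomaly", "Normal", "Normal"]
daily_labels = hard_soft_labels(daily)
```

An isolated "Anomaly" between two "Normal" predictions thus yields a soft positive followed by a soft negative, and never a hard prediction.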

The evaluation of the results is made on the basis of the errors defined in Table 12.


**Table 12.** Definition of hard and soft errors for evaluation of the modified predictions.

#### *3.7. Validation*

3.7.1. Multi-Class Classification

The confusion matrix for the RF model is included in Table 13. It shows 5 hard errors (all of them HP) out of 2912 cases (0.2%). The proportion of soft predictions is below 5%, which implies that the model can be useful for practical application.

**Table 13.** Confusion matrix for the validation set and the RF model. Multi-class classification.


The results for the SVM model are similar, as can be observed in Table 14. As in the previous analyses, the performance is slightly better. In particular, only one hard error is registered, and the number of soft predictions is lower (37; 1%).



These results demonstrate the capability of both algorithms for identifying behaviour patterns. The SVM model consistently outperformed RF in all analyses, though the difference is small. The calibration effort and required computational time are also similar. In other settings, SVM may require more detailed calibration and some variable selection. It should be remembered that the number of inputs is relatively high and that all inputs are highly correlated by their nature. In such a setting, the performance of some classification algorithms may degrade. This was not expected to affect the RF model, which is known to perform well even with many correlated variables, but SVM also provided accurate results without performing variable selection.

The main benefit of this approach is the capability of distinguishing response patterns, not only between normal and anomalous behaviour, but also among different anomalies. By contrast, it has the limitation of requiring the identification and modelling of the expected anomalies. It is thus unclear what the effect of the application of these models would be in practice when some unforeseen anomaly scenario occurs.

#### 3.7.2. Two-Class Classification

The confusion matrix for the RF model and the two-class classification task is included in Table 15. The format of this matrix is unconventional, not only because of the particular definition of the soft predictions, but also because the anomalous situations, which were provided to the model as belonging to a unique Class with label 1, are disaggregated here in accordance with the actual scenario from which they were obtained with the FEM (classes 1a to 4). It should be recalled that the models used for this task were fitted on a training sample including Scenarios 0, 1a, 2a, 3a and 4, and that the anomalous situations in the validation set comprise different anomalies (Scenarios 1b, 2b and 3b) in addition to the normal situation. The prediction task is thus more challenging, but also more realistic, since unforeseen response patterns can be expected to occur in practice.

It can be seen that no HFP are registered for the RF model and the ad hoc criterion defined. The amount of HFN is higher due to the difference in nature of Class 1 samples between the training and the validation set.


**Table 15.** Confusion matrix for the validation set and the RF model. Two-class classification.

In this case, the results for the SVM model are poorer (Table 16), especially for Scenarios 1b and 2b. This may be the effect of the maximization of the margin between categories when applied to samples of a different nature. In this case, no HFP are obtained and the number of soft predictions is 332 (23%).

**Table 16.** Confusion matrix for the validation set and the SVM model. Two-class classification.


The results of this approach for Scenario 3b suggest that it can be useful to detect anomalies only in case they resemble the situations considered for training. By contrast, the model tends to consider as normal those patterns not included in the training data. This limitation is similar to that described for the MC model, and confirms the conclusions drawn in the previous section.

These classification models, fitted with data involving some situations and applied to different anomalies, predict on the basis of the degree of similarity between the new, observed behaviour and those provided for training. Good performance can be expected in terms of anomaly detection when the actual pattern is more similar to some of the foreseen anomalies than to the normal scenario. This is the case of Scenario 3b.

#### 3.7.3. One-Class Classification

The new criterion proved useful for the OC model. Table 17 shows the confusion matrix for the validation set. The ratio of HFP is low (0.1%). A higher proportion of HFN is observed, though still better than for the TC models (3.8%). It should be recalled that the OC model was fitted using exclusively data from normal operation. This is relevant from a practical viewpoint, since this approach avoids the need for identifying and modelling the anomaly scenarios for model fitting.


**Table 17.** Confusion matrix for the validation set and the SVM model. One-class classification.

#### 3.7.4. Summary of Validation

The models examined include relevant differences in terms of the information used for training and evaluation. Those differences need to be considered when comparing performances. Furthermore, although the anomalous scenarios are initially the same for all tasks, they are included in different ways: as different classes (MC), grouped into one single anomalous class with different scenarios in training and testing (TC), or plainly grouped into a global category for all situations different from Scenario 0 (OC).

Keeping these differences in mind, results are summarised and compared in Table 18. It can be seen that the error rates in this case (adding soft and hard errors) are similar to those for the test set (Table 11). This confirms that the model accuracy is representative of the models used and the case study.

The proposed criterion is beneficial for the OC model, in the sense that the majority of raw misclassifications are turned into soft errors.


**Table 18.** Model comparison for the validation set.

#### **4. Conclusions**

Both RF and SVM showed high prediction accuracy for the multi-class classification task (misclassification rate below 0.5%), with SVM slightly better than RF. These models have the advantage of being capable of distinguishing between anomalies of different kinds, which can be useful when potential failure modes can be well defined and modelled. However, this need may be a relevant limitation in many settings for their practical application. Their capability to detect anomalous patterns not considered for model fitting is unclear.

Two-class classification models can only distinguish between two classes (normal and anomalous behaviour), and they are incapable of differentiating among different anomalies. This approach is more representative of the practical application, where unforeseen patterns, not considered for model fitting, may occur. The results for the TC models show their limitations in real settings. Their capability for identifying anomalies is strongly dependent on the nature of the actual pattern and its relation to the situations used for model fitting. While high accuracy was obtained for Scenario 3b, the proportion of misclassifications for Scenarios 1b and 2b is too high for considering this approach in practice.

The one-class classifier based on SVM is fitted exclusively on data for normal operation. This is the typical situation in many dams which have performed correctly for long periods, and thus the approach can be applied in practice using monitoring data. The results were better than those of the TC models, and overall suggest that this model can be useful in practice. Although the accuracy also depends on the properties of the situation to identify, the model is not biased by the decisions of the modeller regarding which scenario to consider: the ability for anomaly detection of this model depends on the magnitude of the anomaly, i.e., serious anomalies can be detected with higher accuracy. The process is simpler because no anomalous data are required for model fitting: there is no need to create a numerical model and the probable anomaly scenarios need to be neither defined nor modelled. This also enlarges the scope of application to any dam typology and response variable, since some phenomena are difficult to simulate with the FEM. The model can be fitted solely with monitoring data in dams with long series of high-quality records for a relevant number of response variables. In general, a FEM model can be created to complement the time series, e.g., to fill periods with missing values.

A practical criterion was defined to classify patterns on the basis of the model outcomes to differentiate predictions as a function of their consistency over time. This resulted in a decrease in misclassification rate for all approaches. Although the overall conclusions hold for all prediction tasks and algorithms, the utility of the one-class classifier is clearer. This criterion is specific to the case study considered, and thus should be adapted to other situations in accordance with the amount of data available, the reading frequency and other problem-specific properties such as the nature of the potential failure scenario. The work also showed that the time window applied has a relevant effect on the performance of the model. Engineering judgment and knowledge of the dam history should be the fundamentals for setting up a procedure for each specific case.

The main drawback of this approach is that no information is obtained regarding the kind of anomaly identified: the outcome of the model is limited to the probability of belonging to the pattern used for model fitting or some other, without further specification. The combination of this approach with engineering knowledge and some other model (either a multi-class classifier or a set of regression models) may result in a more complete pattern identification. The authors are exploring this possibility in an open research line. This involves the need for analysing each output separately, but its application to a set of selected variables can be beneficial to take advantage of the benefits of both approaches, and alleviate their limitations.

Another limitation of these approaches is that high-quality data is needed for model fitting. In this analysis, training data was generated by a FEM model, which ensured that the resulting time series are complete and—in principle—of arbitrary length. By contrast, databases of monitoring data in many dams include periods of missing values, variable reading frequency and other issues. FEM models can be useful for improving the monitoring data to some extent, but still have limitations for some dam typologies, certain failure scenarios and determined response variables. The performance of ML classifiers when fitted with low-quality databases is also the topic of ongoing research.

**Author Contributions:** Conceptualization, F.S. and A.C.; methodology, F.S., A.C., J.I. and D.J.V.; validation, A.C. and J.I.; data curation, A.C.; writing—original draft preparation, F.S.; writing review and editing, F.S., A.C., J.I. and D.J.V.; funding acquisition, F.S. and J.I. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was partially funded by the Spanish Ministry of Science, Innovation and Universities through the Project TRISTAN (RTI2018-094785-B-I00). The authors also acknowledge financial support from the Spanish Ministry of Economy and Competitiveness, through the "Severo Ochoa Programme for Centres of Excellence in R & D" (CEX2018-000797-S), and from the Generalitat de Catalunya through the CERCA Program.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Juan Mata <sup>1,\*</sup>, Fernando Salazar <sup>2</sup>, José Barateiro <sup>1</sup> and António Antunes <sup>1</sup>**


**Abstract:** The main aim of structural safety control is the multiple assessments of the expected dam behaviour based on models and the measurements and parameters that characterise the dam's response and condition. In recent years, there has been an increase in the use of data-based models for the analysis and interpretation of the structural behaviour of dams. Multiple Linear Regression is the conventional, widely used approach in dam engineering, although interesting results have been published based on machine learning algorithms such as artificial neural networks, support vector machines, random forest, and boosted regression trees. However, these models need to be carefully developed and properly assessed before their application in practice. This is even more relevant when an increase in users of machine learning models is expected. For this reason, this paper presents extensive work regarding the verification and validation of data-based models for the analysis and interpretation of observed dam behaviour. This is presented by means of the development of several machine learning models to interpret horizontal displacements in an arch dam in operation. Several validation techniques are applied, including historical data validation, sensitivity analysis, and predictive validation. The results are discussed and conclusions are drawn regarding the practical application of data-based models.

**Keywords:** concrete dam; machine learning methods; structural behaviour; sensitivity analysis; model validation

#### **1. Introduction**

Dam safety is a continuous requirement due to the potential risk of environmental, social, and economic disasters. In ICOLD's bulletin number 138 [1] the assurance of the safety of a dam or any other retaining structure is considered to require "a series of concomitant, well-directed, and reasonably organised activities. The activities must: (i) be complementary in a chain of successive actions leading to an assurance of safety, (ii) contain redundancies to a certain extent so as to provide guarantees that go beyond operational risks" [1]. Continuous dam safety control must be done at various levels. It must include an individual assessment (dam body, its foundation, appurtenant works, adjacent slopes, and downstream zones) and, as a whole, in the various areas of dam safety: environmental, structural, and hydraulic/operational [2].

Structural safety can be understood as the dam's capacity to satisfy the structural design requirements, avoiding accidents and incidents during the service life. Structural safety includes all activities, decisions, and interventions necessary to ensure the adequate structural performance of the dam. The activities performed for the structural safety control of large dams are usually aided by simulation models. According to Lombardi [3]: "the difference between the predicted value and the actual reading is indeed the true criteria to judge the behaviour of the dam". Such predictions can be based on deterministic models, such as finite element models, or on data-based models. Most large dams have an essential database of monitoring measurements, recorded over the years, both from the environmental variables (related to the main loads) and the dam response. This, together with the developments in Machine Learning (ML) techniques, has led to a significant increase in the use of ML models to support the analysis and interpretation of the observed structural behaviour. The ML algorithms applied [4] include multiple linear regressions [3,5,6], artificial neural networks [7–16], support vector machines [17,18], random forest [19] and boosted regression trees [20]. A comparison of their performance can be found in [21].

**Citation:** Mata, J.; Salazar, F.; Barateiro, J.; Antunes, A. Validation of Machine Learning Models for Structural Dam Behaviour Interpretation and Prediction. *Water* **2021**, *13*, 2717. https://doi.org/10.3390/w13192717

Academic Editor: Zhi-jun Dai

Received: 30 July 2021 Accepted: 25 September 2021 Published: 1 October 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Regression models are the best known and most widely used data-based models among engineers responsible for dam safety activities. They have been validated and tested over years of use, and their capabilities and limitations are well known [21]. In recent years, new data-based models based on ML methods have been adopted to provide redundancy with respect to the traditional models used to describe the observed behaviour or, in some cases, to study a particular aspect of the dam behaviour. However, the growing use of ML models is mainly restricted to scientific publications and academic examples, without a broad and deep discussion about model validation and verification issues. These research and technical gaps are partly due to the lack of agreement on how to reliably evaluate the validity of these models and their scope of application. Open issues for their practical implementation include:


In other fields with a longer track record in the application of data-based models, greater effort has been put into developing procedures and concepts for model validation. This is the case of electrical engineering, where Sargent [22] proposed an overall framework for verifying and validating simulation models in any area of expertise.

The problem is more relevant in social sciences, where the decisions of the modeller have a substantial effect on the results because of the higher indetermination in the definition of the prediction task (e.g., [23]). Likewise, when complex databases are used, different decisions may lead to opposite conclusions [24].

The main contribution of this work is the presentation of a methodology for the validation and verification of ML models for the analysis and interpretation of the structural behaviour observed in dams. In this work, Sargent's framework is adapted to dam safety for the validation and verification of data-based models for predicting dam behaviour. The proposed approach is applied to the case study of an arch dam located in Portugal. Models based on neural networks (NN), support vector machines (SVM), random forests (RF), and boosted regression trees (BRT) are fitted for predicting the radial displacement of the highest cantilever. The results are discussed, and conclusions are drawn regarding their application in practice.

The article is organised as follows: Section 2 includes a state-of-the-art review regarding the adoption and use of ML models to analyse and interpret the observed structural dam behaviour. The proposed methodology for model validation and verification is presented in Section 3. The case study is described in Section 4. The results of the application of the proposed methods for validation and verification are presented and discussed in Section 5. Finally, conclusions are included in Section 6.

#### **2. State of Art: Machine Learning Models for Dam Behaviour Interpretation and Prediction**

#### *2.1. Overview about Machine Learning*

Machine Learning is usually described as the study of "computer algorithms that improve automatically through experience" [25]. There are two main ML approaches used in different problems: supervised and unsupervised learning (Figure 1). In unsupervised learning, only the input data is available. The objective is to define groups or classes of similar samples, and then to assign a class to new inputs. In supervised learning, both the input variables and the true output of the system are available during the training stage. The algorithm learns the association between inputs and response so that predictions can be obtained when new inputs are provided to the model. Supervised learning is the approach used in dam safety, using measured data from the past behaviour of the dam for model fitting.

**Figure 1.** Supervised and unsupervised learning approaches.

Supervised models can in turn be classified into regression, if the output variable is numerical, and classification, in case it is categorical. Although some applications of classification in dam safety can be found in the literature [26,27], the vast majority of examples of ML models in this field are based on regression.

As a result, only supervised regression models are considered in this work. Several mathematical algorithms can be adopted, depending on the goal and the data available (Figure 2). A description of the main algorithms used in the context of the interpretation of the behaviour of dams can be found in [4,21]. As mentioned above, NNs, RF, SVM and BRT are considered in this application. These algorithms are succinctly described herein, together with the conventional multiple linear regression model, which is taken as reference.

**Figure 2.** Main machine learning techniques used in dam safety assessment context.

The regression problem in dam safety can be formulated as follows: suppose that a dataset with *p* independent variables and *n* observations, *X*<sub>1</sub>, ..., *X*<sub>*p*</sub>, *Y*, is available, where *Y* represents the observed structural response and the *X*<sub>*j*</sub> are functions of the water height above the dam base, temperature and time. The goal of a regression model is to compute an estimate of the response variable as a function of the inputs (Equation (1)):

$$Y = f(X_j) + \varepsilon, \quad j = 1, \dots, p \tag{1}$$

where *ε* is the model error.

The overall process for building and applying data-based predictive models is summarised in Figure 3. The monitoring data available is split into two separate datasets. The former is used to calibrate the model parameters (e.g., number of neurons in a NN model), while the latter is fed into the calibrated model for verifying prediction accuracy. Some authors use all available data for training, which may be acceptable if regression models are used. However, leaving an independent data set for model evaluation is essential for most ML models to avoid overfitting.

**Figure 3.** General scheme for fitting and applying predictive models based on monitoring data.

#### *2.2. Formulation by Separation of the Reversible and Irreversible Effects: The HST and HTT Approaches*

In the operation phase of the dam's life, the thermal effect is directly related to the air and water temperature variations. There are two main approaches for choosing the parameters that represent the thermal effect in data-based models [6]: the *HST* (hydrostatic, seasonal, time) approach and the *HTT* (hydrostatic, temperature, time) approach. In the *HST* approach, the portion of the structural response due to the thermal effects is usually considered as the sum of sinusoidal functions with an annual period, similar to the variations of air and water temperatures. As a result, the thermal effects are smoothed. In the *HTT* approach, the thermal effects are associated with temperatures measured in the dam body; therefore, the actual evolution of thermal loads is considered. Although some authors have presented the benefits of using measured temperatures of the concrete dam body with, or instead of, sinusoidal functions [5], *HTT* models are not often used in day-to-day analysis because of the difficulty in selecting the variables that represent the thermal effect, especially when there is a large number of input variables. In this work, the *HST* approach was adopted to obtain the main results. More details regarding the input variables usually adopted in this type of model can be found in [5–7,21,28].

The data-based models used for the prediction of the structural response of concrete dams are based on the following simplifying assumptions:


resulting from the variations of the hydrostatic pressure and the temperature) and another part of the inelastic nature (irreversible) such as a time function.

(iii) The effects of the hydrostatic pressure, temperature, and time changes can be evaluated separately.

#### *2.3. Machine Learning Models Used for the Dam Behaviour Interpretation and Prediction*

Multiple Linear Regression (MLR) models are widely used by dam engineers. The form of Equation (1) for a MLR model relating the independent variables to the dependent variable can be written as (Equation (2)):

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_j X_j + \dots + \beta_p X_p + \varepsilon \tag{2}$$

where *ε* stands for the random error. The model can be represented by a system of *n* equations that can be expressed in matrix notation as *Y* = *Xβ* + *ε*, where *Y* is an (*n* × 1) vector of the dependent variable or response, *X* is an (*n* × (*p* + 1)) matrix of the levels of the *p* independent variables, *β* is a ((*p* + 1) × 1) vector of the regression coefficients, and *ε* is an (*n* × 1) vector of random errors.

The model assumes that the expected value of the random error is zero, i.e., *E*(*ε*) = 0, that its variance is *V*(*ε*) = *σ*<sup>2</sup>, and that the errors are uncorrelated [29–31]. In matrix notation, the least squares estimator of *β* is $\hat{\beta} = (X^T X)^{-1} X^T Y$, while the fitted model is $\hat{Y} = X\hat{\beta}$ and the vector of residuals is denoted by $\hat{\varepsilon} = Y - \hat{Y}$.
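The least squares estimator can be illustrated with a minimal sketch in plain Python. The helper names (`fit_mlr`, `predict_mlr`, `solve`) and the synthetic data are assumptions made for illustration only; they are not taken from the study, which relied on R packages. The normal equations are solved by Gauss-Jordan elimination instead of forming the inverse explicitly, which yields the same estimate.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fit_mlr(x_rows, y):
    """Least squares estimate of beta, with an intercept column prepended to X."""
    X = [[1.0] + list(row) for row in x_rows]
    Xt = transpose(X)
    XtX = matmul(Xt, X)                                   # X^T X
    XtY = [sum(xi * yi for xi, yi in zip(col, y)) for col in Xt]  # X^T Y
    return solve(XtX, XtY)

def predict_mlr(beta, x_rows):
    return [beta[0] + sum(b * x for b, x in zip(beta[1:], row)) for row in x_rows]

# Synthetic, noise-free data generated from Y = 2 + 3*X1 - X2 (illustrative only)
x_rows = [(0.0, 1.0), (1.0, 0.0), (2.0, 3.0), (3.0, 1.0), (4.0, 5.0)]
y = [2 + 3 * a - b for a, b in x_rows]
beta_hat = fit_mlr(x_rows, y)
residuals = [yi - yh for yi, yh in zip(y, predict_mlr(beta_hat, x_rows))]
```

Because the synthetic response contains no noise, the estimated coefficients recover the generating values and the residuals vanish up to rounding error.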

The general expression of the regression problem shown in Equation (1) applies to ML regression models. However, the final mathematical expression is in general more complex than that of the MLR, and, therefore, more difficult to analyse.

Neural networks can be considered as an extension of MLR models by adding a non-linear transformation of the inputs [32]. The output of the model is computed as presented in Equation (3):

$$y_{NN} = \sum_{l=1}^{L} w^l \cdot g(X_l) + b \tag{3}$$

where *L* is the total number of neurons in the hidden layer, *w*<sup>*l*</sup> are the weights, *b* is a bias term, *X*<sub>*l*</sub> is a linear transformation of the inputs *X*<sub>*j*</sub>, and *g*(*X*<sub>*l*</sub>) is a non-linear transformation. In this work, a sigmoid function is applied in the hidden layer to the linear transformation *X*<sub>*l*</sub>, which is different for each neuron *l*; it is computed as presented in Equation (4):

$$g(X_l) = \frac{1}{1 + e^{-X_l}} \tag{4}$$
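As a hedged illustration of Equations (3) and (4), the following sketch evaluates a network with a single hidden layer in plain Python. It assumes the standard logistic sigmoid, 1/(1 + exp(-z)), and that each neuron's linear transformation consists of a weight vector plus a bias; the function and parameter names (`nn_predict`, `hidden_weights`, `hidden_biases`, `output_weights`, `output_bias`) and all numerical values are hypothetical.

```python
import math

def sigmoid(z):
    """Standard logistic sigmoid, 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def nn_predict(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """Output of a single-hidden-layer network for one input vector x."""
    total = output_bias                                   # b in Equation (3)
    for v, c, w in zip(hidden_weights, hidden_biases, output_weights):
        x_l = sum(vj * xj for vj, xj in zip(v, x)) + c    # linear transformation for neuron l
        total += w * sigmoid(x_l)                         # w^l * g(X_l)
    return total

# Two inputs, two hidden neurons; the values are chosen so both neurons receive X_l = 0
x = [0.5, -1.0]
hidden_weights = [[1.0, 2.0], [0.0, 0.0]]
hidden_biases = [1.5, 0.0]
output_weights = [2.0, -4.0]
y_nn = nn_predict(x, hidden_weights, hidden_biases, output_weights, output_bias=1.0)
```

In this toy configuration both hidden neurons output 0.5, so the prediction is 1.0 + 2.0 × 0.5 − 4.0 × 0.5 = 0.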

Many variations of the original NN model have been proposed and used in different fields, from the addition of several intermediate layers to the introduction of complex algorithms to account for certain aspects of the model [33,34]. NN can be considered the most popular ML algorithm in dam engineering [7,10,17].

In general, NN are prone to overfitting, i.e., their high flexibility allows for increasing the prediction accuracy for the training data (in theory, perfect accuracy could be achieved [35]). However, this does not imply that such a model will be as accurate for an independent data set. Therefore, the training process needs to be followed with care: an independent data set needs to be considered to evaluate the model, for which the prediction error should be similar to that in the training set. Overfitting can be alleviated using cross-validation or similar resampling techniques.
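The gap between training and independent errors can be shown with a deliberately extreme example. The sketch below uses a one-nearest-neighbour memoriser as a stand-in for a highly flexible model (it is not a neural network): it reproduces the training data perfectly but performs worse on held-out samples. The data are synthetic and all names are assumptions.

```python
import math

def one_nn_predict(train, x):
    """Return the response of the training sample closest to x (a pure memoriser)."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

def rmse(train, samples):
    """Root mean squared error of the memoriser, fitted on train, over samples."""
    errors = [(y - one_nn_predict(train, x)) ** 2 for x, y in samples]
    return math.sqrt(sum(errors) / len(errors))

# Synthetic data: a linear trend plus an alternating +1 / -1 disturbance
data = [(float(i), 0.5 * i + (1.0 if i % 2 == 0 else -1.0)) for i in range(30)]
train = [s for i, s in enumerate(data) if i % 3 != 0]
test = [s for i, s in enumerate(data) if i % 3 == 0]

train_rmse = rmse(train, train)   # zero: every training sample is its own neighbour
test_rmse = rmse(train, test)     # larger: the memorised disturbance does not generalise
```

A training error of zero combined with a clearly larger error on the independent set is the signature of overfitting that the independent evaluation is meant to reveal.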

Support vector machines resemble NNs in some aspects, such as the inclusion of nonlinear transformations of the inputs. In this case, the inputs are transformed, then linear regression is performed on the modified variables. Overfitting is also an issue for SVM models, and the training parameters strongly influence the results. Consequently, SVM models must be fitted with care: wide ranges of parameter values need to be considered, and cross-validation or similar approaches must be followed to obtain reliable estimates on the prediction accuracy. SVM have been used in some publications related to dam safety, but the effect of the model parameters was not examined in depth [4,11,17,21].
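The combination of a wide parameter grid with cross-validation can be sketched as follows. To keep the example self-contained, a one-parameter ridge regression (single input, no intercept) is used as a stand-in for the SVM, with the penalty `lam` playing the role of the training parameter; the data, the grid and every function name are assumptions. Contiguous folds are used, which is one possible choice for time-ordered records.

```python
def fit_ridge(samples, lam):
    """Closed-form ridge slope for a single input without intercept."""
    sxy = sum(x * y for x, y in samples)
    sxx = sum(x * x for x, _ in samples)
    return sxy / (sxx + lam)

def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cv_error(samples, lam, k=5):
    """Mean squared error over all samples, each predicted while held out."""
    total = 0.0
    for held_out in k_fold_indices(len(samples), k):
        held = set(held_out)
        train = [s for i, s in enumerate(samples) if i not in held]
        beta = fit_ridge(train, lam)
        total += sum((samples[i][1] - beta * samples[i][0]) ** 2 for i in held_out)
    return total / len(samples)

# Synthetic data and a wide, logarithmically spaced parameter grid
samples = [(0.1 * i, 0.2 * i + (0.3 if i % 2 == 0 else -0.3)) for i in range(1, 41)]
grid = [0.0, 0.1, 1.0, 10.0, 100.0]
scores = {lam: cv_error(samples, lam) for lam in grid}
best_lam = min(scores, key=scores.get)
```

The same loop structure applies when the inner model is an SVM: only `fit_ridge` and the prediction expression would change, and the grid would span the SVM training parameters.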

Other ML algorithms recently applied to the prediction of dam behaviour belong to the family of tree-based models: random forests (RF) and boosted regression trees (BRT). In both cases, the model outcome is computed from a typically large amount of simple models, in this case, regression trees.

Regression trees are often used in classification problems because of their ease of interpretation: usual rules can be represented by these models in the form of criteria for dividing the original dataset into a series of groups with common features. However, such approach can also be applied for regression, i.e., for predicting numerical variables, by computing the average of the output variable for all elements in each of the resulting groups. This approach keeps the interpretability, but prediction accuracy is lower than for other algorithms. In addition, the resulting model is strongly dependent on the training set, i.e., the addition of a few samples may result in highly different rules and predictions. To overcome the limitations of regression trees, RF were proposed by Breiman [36], in which a significant amount of simple regression trees is created, and the result is computed as the average of their predictions. Additional ingredients of the RF algorithm include:


RF models are easy to implement and robust in complex settings [36], particularly when the *p*/*n* ratio is high [37].
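The averaging idea behind RF can be sketched with one-split regression trees (stumps) fitted on bootstrap resamples. This is a simplified illustration under stated assumptions: it covers only the bootstrap-and-average step for a single input, and omits the random selection of input variables at each split that full RF implementations also use. All names and data are hypothetical.

```python
import random

def fit_stump(samples):
    """Fit a one-split regression tree: returns (threshold, left_mean, right_mean)."""
    xs = sorted({x for x, _ in samples})
    mean_all = sum(y for _, y in samples) / len(samples)
    best = (None, mean_all, mean_all)
    best_sse = sum((y - mean_all) ** 2 for _, y in samples)
    for lo, hi in zip(xs, xs[1:]):
        thr = (lo + hi) / 2
        left = [y for x, y in samples if x <= thr]
        right = [y for x, y in samples if x > thr]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if sse < best_sse:
            best, best_sse = (thr, ml, mr), sse
    return best

def predict_stump(stump, x):
    thr, ml, mr = stump
    if thr is None:          # no useful split was found
        return ml
    return ml if x <= thr else mr

def fit_bagged(samples, n_trees=50, seed=0):
    """Fit each stump on a bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(samples) for _ in samples]) for _ in range(n_trees)]

def predict_bagged(stumps, x):
    """Ensemble prediction: the average of the individual tree predictions."""
    return sum(predict_stump(s, x) for s in stumps) / len(stumps)

# Synthetic step-shaped response (illustrative only)
samples = [(float(i), 0.0 if i < 10 else 5.0) for i in range(20)]
forest = fit_bagged(samples)
low, high = predict_bagged(forest, 2.0), predict_bagged(forest, 17.0)
```

Each resample may place its split at a slightly different threshold; averaging over many trees smooths out that dependence on the particular training set.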

BRTs are also based on trees, with some differences:


BRTs share some of the advantages of RFs regarding their robustness (low dependence on the training parameters) and a lower tendency to overfitting [20].
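A minimal sketch of boosting for regression is given below, assuming the usual least-squares formulation: trees are added sequentially, each fitted to the residuals of the current ensemble, and their contribution is scaled by a learning rate. One-split trees are used for brevity; the function names, the learning rate of 0.1 and the synthetic data are assumptions and do not reproduce the settings of the study.

```python
def fit_stump(samples):
    """Fit a one-split regression tree: returns (threshold, left_mean, right_mean)."""
    xs = sorted({x for x, _ in samples})
    mean_all = sum(y for _, y in samples) / len(samples)
    best = (None, mean_all, mean_all)
    best_sse = sum((y - mean_all) ** 2 for _, y in samples)
    for lo, hi in zip(xs, xs[1:]):
        thr = (lo + hi) / 2
        left = [y for x, y in samples if x <= thr]
        right = [y for x, y in samples if x > thr]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if sse < best_sse:
            best, best_sse = (thr, ml, mr), sse
    return best

def predict_stump(stump, x):
    thr, ml, mr = stump
    if thr is None:
        return ml
    return ml if x <= thr else mr

def fit_boosted(samples, n_trees=100, learning_rate=0.1):
    """Sequentially fit stumps to the residuals left by the previous ones."""
    base = sum(y for _, y in samples) / len(samples)      # initial constant prediction
    residuals = [y - base for _, y in samples]
    stumps = []
    for _ in range(n_trees):
        stump = fit_stump([(x, r) for (x, _), r in zip(samples, residuals)])
        stumps.append(stump)
        residuals = [r - learning_rate * predict_stump(stump, x)
                     for (x, _), r in zip(samples, residuals)]
    return base, learning_rate, stumps

def predict_boosted(model, x):
    base, lr, stumps = model
    return base + lr * sum(predict_stump(s, x) for s in stumps)

def mse(model, samples):
    return sum((y - predict_boosted(model, x)) ** 2 for x, y in samples) / len(samples)

# Synthetic quadratic response (illustrative only)
samples = [(float(x), float(x * x)) for x in range(10)]
mse_0 = mse(fit_boosted(samples, n_trees=0), samples)
mse_1 = mse(fit_boosted(samples, n_trees=1), samples)
mse_100 = mse(fit_boosted(samples, n_trees=100), samples)
```

The training error decreases as trees are added; in practice the number of trees and the learning rate would be chosen with an independent data set or cross-validation rather than from the training error.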

An in-depth description of the mathematical foundation of these algorithms is out of the scope of this article. They can be found in many publications with different viewpoints, from theory to practice, a selection of which is included in Table 1.


**Table 1.** Reference publications on the ML algorithms considered.

#### **3. Methodology Proposed for Model Validation and Verification**

The main concepts coined for general verification and validation of simulation models can be summarised, based on Sargent's framework [22], as follows:


A model may be valid for one set of experimental conditions and invalid for another. Performing model verification and validation is usually part of the (entire) model development process. The graphical paradigm developed by Sargent [22] was adapted for dam engineering (Figure 4).

**Figure 4.** Simplified version of the modelling process for dam engineering, adapted from [22].

These definitions presented by Sargent [22] were updated for the case of data-based models in the field of dam safety and dam engineering. Thus, for the verification and validation of ML models for the interpretation of structural dam behaviour, specialists in dam safety activities can consider that:


ature variations similar to a sinusoidal shape with an annual period? Can the effects of the hydrostatic pressure, temperature, and time changes be evaluated separately?


Verification and validation must be performed again when any change is made to the model. Regarding the problem entity, changes can result from an incident, a change in the properties of the dam body (e.g., changes in concrete properties due to alkali-silica reactions) or improvement works performed in the structure or its surroundings (e.g., tunnel excavations near the dam body for the construction of power boosts). In typical situations, these models are used to characterize the pattern of the structural dam behaviour under normal operating conditions and then to identify possible changes in the structural condition of the dam.

In addition to the work used as the primary reference, several works [49–54] present several validation techniques in different areas of expertise. Some of them have been used in dam engineering for the validation of traditional models.

Based on the above, the authors propose the adoption of the following techniques for the verification and validation of ML models for dam behaviour interpretation: historical data validation, parameter variability-sensitivity analysis, predictive validation, comparison to other models, and analysis of the time evolution of the residuals, as presented in Figure 5 and succinctly described in Table 2.

In dam engineering, the HST approach is widely accepted, as mentioned above. This approach assumes that the effects of the main loads (water level and temperature variations) on the structural response can be considered independently (known as separation of effects). This can be checked by applying a sensitivity analysis as proposed in this work. Indirectly, the potential risk of dams (low probability of accident but with very high consequences) is an aspect that justifies the need for a robust model validation and verification process.
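One way such a sensitivity check could be organised is sketched below: the model is evaluated over a grid of water levels for several days of the year, and the season-to-season shift of the response is compared across water levels. If the effects are separable (additive), the curves are parallel and the shift does not depend on the water level. Both model functions and their coefficients are invented for illustration, as is the name `separation_gap`; they are not the fitted models of this study.

```python
import math

def additive_model(h, d):
    """Hypothetical model with separable hydrostatic and seasonal effects."""
    return -2.0e-7 * h ** 4 + 1.5 * math.sin(d) + 0.8 * math.cos(d)

def interacting_model(h, d):
    """Hypothetical model in which the seasonal effect depends on the water level."""
    return -2.0e-7 * h ** 4 + 0.03 * h * math.sin(d)

def separation_gap(model, levels, seasons):
    """Largest spread, across water levels, of the shift between seasons.

    A value close to zero indicates parallel response curves, i.e. additive effects.
    """
    reference = seasons[0]
    gap = 0.0
    for d in seasons[1:]:
        shifts = [model(h, d) - model(h, reference) for h in levels]
        gap = max(gap, max(shifts) - min(shifts))
    return gap

levels = [40.0 + 5.0 * i for i in range(8)]                      # water heights (m)
seasons = [2 * math.pi * j / 365 for j in (15, 105, 196, 288)]   # four days of the year
gap_additive = separation_gap(additive_model, levels, seasons)
gap_interacting = separation_gap(interacting_model, levels, seasons)
```

For a fitted ML model, `model` would be replaced by its prediction function; a gap that is small compared with the measurement accuracy would support treating the hydrostatic and thermal effects separately.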

**Table 2.** Description of the proposed techniques for model validation and verification of ML models for dam behaviour interpretation, adapted from [22].


The mathematical calculation and the graphic representation presented in the following sections were supported by the R project software and several packages [55–62].

#### **4. Case Study**

#### *4.1. The Salamonde Dam*

Salamonde dam (Figure 6) is located on the Cávado River, which flows through northwestern Portugal, and is part of the Cávado-Rabagão-Homem hydroelectric system owned and operated by EDP [63]. The Salamonde dam is a double curvature arch dam. Construction was completed in 1953. The maximum dam height is 75 m, and the total crest length is 284 m. The maximum reservoir water level is 280.0 m, with a total storage capacity of 65 hm<sup>3</sup>. Following the best technical practices, the monitoring system of the Salamonde dam aims at the evaluation of the loads, the characterization of the rheological, thermal and hydraulic properties of the materials, and the evaluation of the structural response.

The monitoring system of the Salamonde dam consists of several devices, which measure physical quantities such as concrete and air temperatures, reservoir water level, seepage and leakage, displacements in the dam and on its foundation, joint movements, and pressures, among others. The deformation of the dam body is controlled with five inverted pendulums (FP1 to FP5).

In this case study, the (absolute) horizontal displacements in the radial direction measured at the highest base (at 277.07 m) of the combined FP3 and FP5 pendulums are analysed (FP3/5–277.07 m). The location of the FP3/5–277.07 m base is shown in Figure 6.

**Figure 6.** Salamonde dam: Upstream and downstream dam faces and pendulum distribution.

#### *4.2. The Analysed Data*

The data analysed corresponds to a period between January 2000 and December 2019, resulting in more than 934 observations per variable. The data between January 2000 and December 2014 was used for training the ML models. The dam behaviour and its structural condition during the period of the training set are considered adequate by the dam engineering specialists. The main purpose is therefore to obtain ML models able to represent the behaviour pattern observed in the training set with a good generalization capability. Once the ML models are generated, the purpose is to use them in the day-to-day dam monitoring and safety activities. Thus, the data from the time period between January 2015 and December 2019 was adopted as a predicted set. The time evolution of the reservoir water level and radial displacements in the referred FP3/5–277.07 m base are presented in Figure 7. The statistical characterization of the radial displacement and of the water height variations is presented in Table 3.
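The chronological split described above can be sketched in a few lines. The record layout (a date paired with a value), the monthly placeholder readings and the function name `split_by_date` are assumptions for illustration; only the cutoff between December 2014 and January 2015 comes from the text.

```python
from datetime import date

def split_by_date(records, cutoff):
    """Split (date, value) records into those before the cutoff and those from it onwards."""
    train = [r for r in records if r[0] < cutoff]
    predicted = [r for r in records if r[0] >= cutoff]
    return train, predicted

# Hypothetical records: one placeholder reading on the first day of each month, 2000-2019
records = [(date(year, month, 1), 0.0)
           for year in range(2000, 2020) for month in range(1, 13)]

train, predicted = split_by_date(records, cutoff=date(2015, 1, 1))
```

Splitting by date rather than at random keeps the predicted set strictly later than the training set, which mirrors how the models would be used in day-to-day monitoring.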

**Figure 7.** Radial displacements in FP3/5–277.07 m and reservoir water level along time.

**Table 3.** Statistical parameters for the radial displacement measured in the FP3/5–277.07 m and for the reservoir water level.


The structural response of the displacement at any point of the dam is strongly related to the corresponding variation in the water level in the reservoir. The observations presented in Figure 7 were used for the computation of the models presented in this work. Signs (+) indicate displacements towards upstream and signs (−) towards downstream.

Characterizing and understanding the main data before any in-depth analysis is essential to avoid misinterpreting the dam behaviour. Part of this task is usually performed through quick data visualization, as presented in Figure 7. Figures 8–11 can also support this first analysis by identifying the domain of the training set in terms of (i) the water level (Figures 8 and 9) and, indirectly through the range of data in each month, the thermal loads, and (ii) the displacements observed (Figures 10 and 11).

**Figure 8.** Water height from 2000 to 2019, by month (training set: from 2000 to 2014 and predicted set: from 2015 to 2019).

**Figure 9.** Water height from 2000 to 2019, in polar coordinate system, by month (training set: from 2000 to 2014 and predicted set: from 2015 to 2019).

**Figure 10.** Radial displacements in FP3/5–277.07 m from 2000 to 2019 by month (training set: from 2000 to 2014 and predicted set: from 2015 to 2019).

(**a**) Positive values of measurements. (**b**) Negative values of measurements.

**Figure 11.** Radial displacements in FP3/5–277.07 m from 2000 to 2019 in polar coordinate system, by month (training set: from 2000 to 2014 and predicted set: from 2015 to 2019).

#### *4.3. The ML Models Developed*

Once basic knowledge of the variations in the main loads and in the radial displacement had been acquired, the ML models were developed following the HST approach. The ML methods used were Multiple Linear Regression (MLR), Support Vector Machine (SVM), Multilayer Perceptron Neural Network (NN), Random Forest (RF) and Boosted Regression Trees (BRT).

Although different ML models were considered in this work, the article focuses on the methods and criteria for model validation and verification rather than on the implementation of these models or their capabilities and limitations. This implies that the models shown are not optimal, but they present an adequate generalization capacity. In general, default training parameters were considered and a reduced number of input variables was included, with the same terms used for all models.

The MLR model was used for comparison since it is widely used and well known in the dam engineering community. In addition, the radial displacement of an arch dam was chosen, for which the HST approach is appropriate.

In this case study, the MLR model with the best performance for the radial displacement at FP3/5–277.07 m, *y<sub>MLR</sub>*, was obtained as the sum of the hydrostatic pressure term *β*<sub>4</sub> × *h*<sup>4</sup> (where *h* is the reservoir water height, which can vary between 0 and 75 m) and the temperature terms *β*<sub>5</sub> × sin(*d*) + *β*<sub>6</sub> × cos(*d*), which represent the effect of the annual thermal variation, where *d* = 2*πj*/365 and *j* is the number of days between the beginning of the year and the date of the observation. No relevant time-dependent variation was recorded in the past behaviour of Salamonde Dam. This was verified in preliminary fits of the MLR model that included time-dependent terms. The results (not shown) confirm the negligible effect of time. Therefore, no time effect was considered in any ML model.

The input variables (*h*<sup>4</sup>, sin(*d*) and cos(*d*)) were used in the same form in all the ML models. They were previously transformed to vary between zero and one. The regression coefficients obtained for the MLR model are *β*<sub>4</sub> = −26.445, *β*<sub>5</sub> = −12.378 and *β*<sub>6</sub> = −12.624, with *β*<sub>0</sub> = 42.125, so the MLR model is given by Equation (5).

$$y\_{MLR} = -26.445h^4 - 12.378\sin(d) - 12.624\cos(d) + 42.125\tag{5}$$
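
As a hedged illustration, Equation (5) can be evaluated as in the sketch below. The coefficients and the 75 m upper bound come from the text. The scaling to [0, 1] is an assumption: the article does not report the bounds it used, so min-max scaling over the theoretical ranges is adopted here, and the function names are hypothetical.

```python
import math

# Coefficients reported in the text for Equation (5).
B0, B4, B5, B6 = 42.125, -26.445, -12.378, -12.624
H_MAX = 75.0  # upper bound of the water height h, in m (from the text)

def scaled_inputs(h, j):
    """Return (h^4, sin d, cos d), each rescaled to [0, 1].

    ASSUMPTION: min-max scaling over the theoretical ranges
    [0, H_MAX^4] and [-1, 1]; the article does not give its bounds.
    """
    d = 2.0 * math.pi * j / 365.0
    h4 = (h / H_MAX) ** 4
    return h4, (math.sin(d) + 1.0) / 2.0, (math.cos(d) + 1.0) / 2.0

def y_mlr(h, j):
    """Evaluate Equation (5) for water height h (m) and day index j."""
    h4, s, c = scaled_inputs(h, j)
    return B4 * h4 + B5 * s + B6 * c + B0

# Example: full reservoir on the first day of the year.
full_reservoir_new_year = y_mlr(75.0, 0)
```

With different scaling bounds the numerical output would change, so the values returned here should not be read as predictions for Salamonde Dam.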

The residuals were obtained through the difference between the observed horizontal displacement and the corresponding predicted value. These values contain all information that the model cannot explain.

Regarding the development of the remaining ML models:


Higher prediction accuracy may be obtained for the case study after a thorough selection of variables and model calibration, but that is outside the scope of this article. Instead, emphasis is placed on analysing the outcomes of the models and their comparison with reference methods and engineering knowledge.

#### **5. Results and Discussion**

The results obtained from the proposed model validation and verification techniques, split into five steps, are presented and discussed in the following sections (Figure 12).

**Figure 12.** Workflow of the proposed techniques for the verification and validation of ML models for dam behaviour interpretation.

#### *5.1. Step 1: Historical Data Validation*

The availability of monitoring data for a certain period is a requirement for fitting any predictive data-based model. In general terms, reserving part of the data for validating the model is always good practice. The primary step in model validation is thus the comparison between model predictions and the observed response for the time period used for model fitting (training period). Once a model is obtained (e.g., when good performance and generalization capacity are expected) and put into operation, new data are presented to it. In the case study, these new data, from the predicted period, result from measurements obtained after the last measurement record of the training period. Since the training and the predicted periods are both part of the historical data, a similar pattern within the two periods is expected if there is no change in the behaviour of the reservoir-dam-foundation system. As a consequence, a similar performance is expected when the predicted set is presented to the model.

Three different ways of comparison are suggested. First, the time series of predictions can be plotted together with the observations, i.e., placing the date on the horizontal axis and both predictions and observations on the vertical one. This allows for analysing whether the model predictions follow the variations observed due to changes in water level and air temperature. A detailed analysis of this plot also permits identifying periods of large errors, which can be later analysed in detail. Plotting the time series of residuals, i.e., the difference between predictions and observations, is also good practice. Figure 13 shows these plots for all models considered.

The nature of the predictive model needs to be considered for the interpretation of these plots. In particular, the MLR model is less flexible and, therefore, usually presents more bias and less variance. It can be taken as a reference for assessing the variance of ML models, more prone to overfitting.

The reading frequency for water level is higher than for the response variable in the selected case study. This is often the case in many dams without a fully automated monitoring system. As a result, model predictions can be plotted with a higher degree of detail, and thus conclusions can be drawn from the time evolution of predictions.

Having a low training error is a necessary condition for an ML model to be effective, but it is not sufficient for ensuring high accuracy in practice. All ML models may overfit the training data, in which case the actual prediction accuracy, i.e., that obtained when new data are fed to the model, is much lower. However, computing some error metrics is a useful first step in model validation. Table 4 presents some performance parameters for the training set: the mean error, mean(*e*); the mean absolute error, mean(|*e*|); the maximum absolute percentage error, *MAPE*; the maximum absolute error, |*e<sub>max</sub>*|; the minimum absolute error, |*e<sub>min</sub>*|; and the root mean square error, *RMSE*. As a complement to the table and plots presented, the density function of the residuals is also useful (Figure 14). It can be seen that the overall density functions resemble the normal distribution for all ML models, while that of MLR shows a left tail.
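
The performance parameters listed above can be computed as in the following minimal sketch. Residuals are taken as observed minus predicted, as defined earlier in the text. Since the text names *MAPE* as the maximum absolute percentage error, both the mean and the maximum of the percentage errors are returned here; the function name, the dictionary keys and the sample values are assumptions made for illustration.

```python
import math

def error_metrics(observed, predicted):
    """Summary metrics of the residuals e = observed - predicted."""
    e = [o - p for o, p in zip(observed, predicted)]
    abs_e = [abs(v) for v in e]
    # Percentage errors are undefined for zero observations, so skip them.
    ape = [100.0 * abs(v) / abs(o) for v, o in zip(e, observed) if o != 0]
    if not ape:
        raise ValueError("all observations are zero; percentage error undefined")
    n = len(e)
    return {
        "mean_e": sum(e) / n,
        "mean_abs_e": sum(abs_e) / n,
        "mean_ape": sum(ape) / len(ape),
        "max_ape": max(ape),
        "max_abs_e": max(abs_e),
        "min_abs_e": min(abs_e),
        "rmse": math.sqrt(sum(v * v for v in e) / n),
    }

# Made-up values for illustration only.
metrics = error_metrics([1.0, 2.0, 4.0], [1.5, 2.0, 3.0])
```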

**Table 4.** Performance parameters for the MLR, SVR, NN, RF and BRT models regarding the training set.


**Figure 13.** Measurements of the training set and output of the Machine Learning models for the horizontal upstream-downstream displacements at FP3/5–277.07 m base, block GH.

**Figure 14.** ML model residuals distribution for the training data.

The performance metrics for the training set show higher accuracy for the SVR, NN, RF and BRT models than for the MLR model. This can be taken as a first verification of the validity of the calibration process. However, this result should be confirmed by the results on an independent data set, not used for model fitting.

It should be noted that in spite of the low values of the mean absolute error and the RMSE, MAPE is very high. This is because the target variable includes values close to zero, for which even small prediction errors can result in very high MAPE (the error is divided by the observed value). It is thus important to consider several performance parameters.

The graphical analysis of residuals (difference between predictions and observations) is another helpful tool in model validation. This can be done by exploring the predictions vs. observations plot, which is a conventional way of analysing predictive models. Adding a straight line through the origin with a 1/1 slope helps the analysis, since it corresponds to a perfect fit. These plots are included in Figure 13 for all models. In addition to the spread of the results and their distance to the perfect fit line, these plots allow verifying the sign of the errors and detecting changes for certain regions of the range of variation of the target variable.

A modified version of the plots mentioned above that can be interesting in dam safety analysis implies the separation of the overall predictions-observations couples as a function of the month the measurements were taken. Figure 15 includes this representation for all predictive models and the training data. Vertical lines were added to highlight the range of variation of the response variable for each month. This is highly relevant for all models, but particularly for those based on ML: their applicability is restricted to such range of variation of the data. Their higher flexibility implies that their reliability greatly decreases when new inputs are taken out of the scope of the training data. So, the application of any ML model (even MLR models) for new data out of the domain of the training set is not recommended because meaningless or even erroneous results may be obtained.
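
A simple way to check whether a new input lies within the monthly domain of the training set is sketched below. The helper names and the sample water heights are hypothetical; the same check could be applied to any input or response variable.

```python
from collections import defaultdict

def monthly_ranges(training):
    """Map month -> (min, max) of a variable over (month, value) pairs."""
    grouped = defaultdict(list)
    for month, value in training:
        grouped[month].append(value)
    return {m: (min(v), max(v)) for m, v in grouped.items()}

def in_training_domain(ranges, month, value):
    """True if value lies within the training range seen for that month."""
    if month not in ranges:
        return False
    low, high = ranges[month]
    return low <= value <= high

# Made-up water heights (m) by month.
ranges = monthly_ranges([(1, 68.0), (1, 73.5), (7, 55.0), (7, 71.0)])
```

A prediction requested for a month and value that fail this check is an extrapolation and, following the recommendation above, should be treated with caution.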

#### *5.2. Step 2: Sensitivity Analysis*

The identification of the effect of each external variable is often used when analysing predictive models in dam safety. Different procedures can be applied, partially dependent on the nature of the model. ML techniques are often criticised for being "black boxes", difficult to interpret. These models can capture interactions among inputs and non-linearities, making them more complex and challenging to analyse. Nonetheless, some specific methodologies for model interpretation can be applied to particular algorithms. For example, relative influence and partial dependence plots helped detect thermal inertia in an arch dam from a model based on BRT [20]. Other techniques have been proposed for the same purpose [64].

In the case of MLR, the conventional procedure implies plotting the predictions when the input under analysis is varied over its range of variation while the other inputs are fixed to their mean values. This is the simplest form of sensitivity analysis. This approach has limitations since the external effects are not independent (water level variations affect the thermal field in the dam body). However, it is simple and well known and can be applied to any predictive model. Therefore, it was used in this work. The results are shown and discussed herein.

**Figure 15.** ML model values vs measurements of radial displacements in FP3/5–277.07 m from 2000 to 2014 (training data). The vertical lines are the maximum and minimum measured values obtained during the training set.

#### 5.2.1. Effect of Water Level

The analysis of the effect of water level as learned by each predictive model was based on the plots shown in Figure 16. The curves show model predictions for the 15th day of each month over a vector of water level values taken at intervals of 0.1 m along the possible range of variation. The available measurements are also plotted for reference, as well as vertical lines highlighting the maximum and minimum observed values. However, the distance from the curves to the observations does not reflect the actual prediction errors, since the actual day of record is, in general, different from the 15th for the observations.
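
The construction of these curves can be sketched as follows, assuming a fitted model is available as a callable taking the water height and the day index. The `toy_model` function is a hypothetical stand-in and not one of the models of the case study; the helper names and the choice of a non-leap reference year are also assumptions.

```python
import math
from datetime import date

def day_index(month, day=15, year=2001):
    """Days elapsed since 1 January (2001 is an arbitrary non-leap year)."""
    return date(year, month, day).timetuple().tm_yday - 1

def water_level_grid(h_min, h_max, step=0.1):
    """Water levels from h_min to h_max (inclusive) at a fixed interval."""
    n = int(round((h_max - h_min) / step))
    return [h_min + i * step for i in range(n + 1)]

def sensitivity_curves(predict, h_min, h_max):
    """One curve per month: predictions along the water level grid."""
    grid = water_level_grid(h_min, h_max)
    return {m: [predict(h, day_index(m)) for h in grid] for m in range(1, 13)}

# Hypothetical stand-in for a fitted model; any callable (h, j) -> y works.
def toy_model(h, j):
    return -0.1 * h + math.sin(2.0 * math.pi * j / 365.0)

curves = sensitivity_curves(toy_model, 0.0, 75.0)
```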

The effect of hydrostatic load on the radial displacements in arch dams is well known: higher water levels are associated with deformation toward the downstream side and vice-versa. This was captured by all models considered, though relevant differences can be mentioned.

The curves for models based on trees (BRT and RF) show high non-linearities for some months, e.g., February, June and August. In addition, predictions follow a series of steps as the water level increases. This is due to the underlying mechanism for fitting these models. The regression trees used are created by dividing the input space into non-overlapping regions and computing predictions independently for each region.

**Figure 16.** Sensitivity analysis due to the hydrostatic effect for Machine Learning model results based on the training data. The vertical lines are the maximum and minimum measured values obtained during the training set.

By contrast, NN and SVR, both based on non-linear smooth transformations of the inputs, resulted in smooth effects. In general, this can be considered as more representative of the actual effect of the hydrostatic load. Nonetheless, the results for some months are not in accordance with engineering knowledge. For instance, the curve for July of several models shows a horizontal part followed by a stretch with a high slope, then another inflection. Similar shapes are obtained for May and June. This can reasonably be attributed to some degree of overfitting. It should be noted that only a small number of observations is available for these months at low water levels, whereas the density of records for high levels is greater. By contrast, the SVR model shows low variance without inflection points and results close to those obtained from the MLR model.

Since predictions were generated for the whole range of water level variation and all months, this plot is also helpful in observing the behaviour of the models when extrapolating. Tree-based models take constant values when predicting out of the training range. This results in close to horizontal sections of the curves (e.g., January, February, March). The results for other models show the increasing variance from MLR (the lowest) to SVR and NN.
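
The stepped curves and the constant extrapolation of tree-based models can be reproduced with a single one-dimensional regression tree, as in the sketch below. It is a simplified illustration with made-up data and is not the RF or BRT implementation used in the study, both of which combine many such trees.

```python
def sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def fit_tree(xs, ys, depth):
    """Fit a 1-D regression tree; a leaf is a float, a node is a tuple."""
    if depth == 0 or len(set(xs)) < 2:
        return sum(ys) / len(ys)
    points = sorted(set(xs))
    best_thr, best_cost = None, float("inf")
    for lo, hi in zip(points, points[1:]):
        thr = (lo + hi) / 2.0
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        cost = sse(left) + sse(right)
        if cost < best_cost:
            best_thr, best_cost = thr, cost
    left = [(x, y) for x, y in zip(xs, ys) if x <= best_thr]
    right = [(x, y) for x, y in zip(xs, ys) if x > best_thr]
    return (
        best_thr,
        fit_tree([x for x, _ in left], [y for _, y in left], depth - 1),
        fit_tree([x for x, _ in right], [y for _, y in right], depth - 1),
    )

def predict(tree, x):
    """Descend to a leaf and return its constant value."""
    while isinstance(tree, tuple):
        thr, left, right = tree
        tree = left if x <= thr else right
    return tree

# Made-up training data: water heights 40..75 m with a linear response.
xs = [float(h) for h in range(40, 76)]
ys = [-0.4 * (h - 40.0) for h in xs]
tree = fit_tree(xs, ys, depth=2)
# Distinct predicted values over 0..100 m, i.e., the "steps" of the curve.
steps = sorted({predict(tree, h / 10.0) for h in range(0, 1001)})
```

A tree of depth 2 can return at most four distinct values, and any water height below 40 m or above 75 m receives the value of the nearest edge leaf, which is the behaviour described above for extrapolation.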

The reliability of the results for all models is nonetheless poor for low water levels and winter months since the information available in the training set for those conditions is also poor.

However, the disagreement observed between the results for the water level effect and the physical phenomenon does not imply that the corresponding models have to be discarded for prediction. Their performance can be useful as long as the input variables remain within the range of variation of the training set. This is actually the situation in the case study: the reservoir level in the prediction period remained high (Figure 7).

#### 5.2.2. Effect of Temperature

A similar process was followed to compute the effect of temperature on the radial displacement. It should be remembered that temperature is indirectly considered in the models as a function of the calendar day. Therefore, the curves are calculated in this case from the predictions for each date. As for the water level, the range of variation from the training set was divided into 8 regions with the same amount of records. Predictions are made from the mean level for each region.
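
The division of the water level range into regions with the same number of records can be sketched as follows. The function name and the sample water heights are hypothetical; the sample is deliberately sparse at low levels to mimic the situation described in the text.

```python
def equal_count_regions(levels, n_regions=8):
    """Split sorted water levels into regions holding equal record counts.

    Returns a list of (low, high, mean) tuples, one per region.
    Requires at least n_regions records.
    """
    ordered = sorted(levels)
    n = len(ordered)
    regions = []
    for k in range(n_regions):
        chunk = ordered[k * n // n_regions:(k + 1) * n // n_regions]
        regions.append((chunk[0], chunk[-1], sum(chunk) / len(chunk)))
    return regions

# Made-up water heights (m): sparse at low levels, dense at high levels.
levels = [40, 48, 55, 60, 64, 66, 68, 69, 70, 71, 72, 72.5, 73, 73.5, 74, 75]
regions = equal_count_regions(levels)
```

The mean of each region is the water level at which predictions would then be computed for every calendar day.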

Figure 17 shows the results. The size of the intervals reflects the scarcity of records for low water levels: the first two regions each span around 8 m of water level variation to include as many records as regions spanning 1–3 m at high water levels.

**Figure 17.** Sensitivity analysis due to the thermal effect for Machine Learning model results based on the training data. The vertical lines are the maximum and minimum measured values obtained during the training set.

The shape of the curves of all models again matches the knowledge on the physical process: high temperatures result in concrete expansion, which generates deformations in the upstream direction due to the mechanical restrictions of arch dams. The same reason results in deformation to the downstream side for the first months of the year.

The thermal effect is also well known in these structures and was captured by the models. The largest deformations toward upstream do not coincide with the hottest months but occur some time later. This reflects the delay between changes in air temperature and the thermal field in the dam body.

#### *5.3. Step 3: Predictive Validation*

The same methods used for historical data can be applied to the recent period of records, which was not used for model fitting.

The analysis of predictions for this period shall focus on detecting whether overfitting exists in ML models. First, prediction errors were computed (Table 5). The highest prediction accuracy was obtained for MLR, though differences were slight.

The comparison between these results and those for the training set reveals a relevant difference between MLR and the other ML models: while the performance of MLR is better for the prediction set, all the other ML models featured lower accuracy. This result is reasonable and shows the tendency of ML models to overfit. MLR is less flexible, which is a limitation under some circumstances (if more relevant inputs are involved or if response variables of a different nature need to be considered) but reduces the risk of overfitting. It can be concluded that training error is a good estimate of generalization capability for MLR but not for the other ML models. This implies that using a separate part of the training set to evaluate model accuracy is essential when ML models are used (e.g., NN), but not as critical for the MLR model.


**Table 5.** Performance parameters for the MLR, SVR, NN, RF and BRT models regarding the predicted set.

The difference between the accuracies on the two datasets also depends on the distribution of the target variable in each set. In the case study shown, the water level in the prediction period remains high. Consequently, the performance metrics of the models for this period correspond to their predictive capability for high water levels. Since high levels were more frequent in the training set, model accuracy is also higher under these conditions. Hence, the overall prediction accuracy across the full range of water levels can be expected to be lower for all models. This should be taken into account in practice if the model is applied during a period of low hydrostatic load.

As in the training set, MAPE is high for all models for the same reason: a relatively low error for small target values may result in extremely high MAPE. As an example, the maximum MAPE for the MLR model is 560% for an observed displacement of 0.7 mm. This suggests that MAPE can be misleading when applied to certain target variables.
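
As a quick check of the figures quoted above, a percentage error of 560% on an observed displacement of 0.7 mm corresponds to an absolute error of about 3.9 mm:

$$\lvert e \rvert = \frac{560}{100} \times 0.7\ \text{mm} = 3.92\ \text{mm}$$

The same absolute error on an observed displacement of 10 mm would give a percentage error of only about 39%, which illustrates how strongly this metric depends on the magnitude of the target value.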

Figure 18 shows the time series of predictions and observations with the corresponding residuals. A first exploration of these plots may lead to the conclusion that all models have excessive variance: the time series of predictions is noisier than that of observations. However, it should be recalled that the reading frequency of the water level is higher than that of the displacements, and all predictions are shown in the plot. In addition, the time series of reservoir level is indeed noisy (possibly due to the daily variations of the water height in the reservoir). Taking these factors into consideration, the variance in predictions seems reasonable. In addition, similar variability can be observed for MLR and ML models. These conclusions were confirmed by comparing the standard deviation of the observations with that of the predictions from each model (see Table 6).
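
The comparison of standard deviations can be sketched as follows. The function name and the sample series are made up for illustration; the sample standard deviation is used here as an assumption, since the text does not specify which estimator was applied.

```python
import statistics

def variability_ratio(observed, predicted):
    """Ratio of the standard deviation of predictions to observations."""
    return statistics.stdev(predicted) / statistics.stdev(observed)

# Made-up series for illustration only.
observed = [-4.0, -2.0, 0.0, 2.0, 4.0]
predicted = [-3.0, -1.5, 0.0, 1.5, 3.0]
ratio = variability_ratio(observed, predicted)
```

A ratio well above one would point to predictions that vary more than the measurements, while a ratio close to one is consistent with a reasonable variance.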

The plot of predictions vs. observations (Figure 19) shows no relevant differences to that for the training set. However, this kind of representation may be useful to check the performance and confirm that the prediction is within the domain of the horizontal displacements considered in the training set (vertical lines in the figure).


**Table 6.** Standard deviation of predictions and observations for the training and predicted periods.

**Figure 18.** Measurements of the predicted set and output of the Machine Learning models for the horizontal upstream-downstream displacements at FP3/5–277.07 m base, block GH.

#### *5.4. Step 4: Time Evolution of Residuals*

The effect of time is often interpreted as the contribution of irreversible deformations to the displacements since, in principle, such term encompasses all effects not explained by the loads. This is the case of displacements in arch dams. In other typologies and response variables, other inputs may play a role and a more detailed analysis is recommended.

In the case study considered, preliminary tests on the MLR models showed no relevant irreversible effects. As a result, time was not considered as an input in the models. However, further verification can be made even in models where time was considered negligible. Since model predictions are solely based on the external loads (temperature indirectly considered from the calendar day), the temporal evolution of the residuals may reveal unforeseen irreversible effects. This approach is similar to the analysis of the "Corrected measurements" proposed by Guo et al. [65]: the difference between model predictions without time-dependent input and observed data can be considered as the evolution of the response of the dam for identical load conditions over time. It can thus be expressed as presented in Equation (6):

$$\text{CM} = \text{Y}\_{\text{obs}} - \text{Y}\_{\text{pred}} \tag{6}$$

provided that time is not considered in the calculation of *Ypred*.

Guo and co-authors also suggested fitting a linear model to the *CM* to draw conclusions on the evolution of the dam behaviour with the Equation (7):

$$CM = a\_0 + a\_1t + \epsilon \tag{7}$$

where *t* is the time.

**Figure 19.** ML Model values vs measurements of radial displacements in FP3/5–277.07 m from 2015 to 2019 (predicted data). The vertical lines are the maximum and minimum measured values obtained during the training set.

The linear model can later be analysed to identify irreversible effects. This procedure was followed for all models. In all cases, the slope of the linear fit is positive and small, which might indicate a temporal evolution of the deformations toward the upstream direction, Figure 20. However, the fit of the linear model to the residuals is also poor for all models (R<sup>2</sup> below 0.1), so no reliable conclusions can be drawn in this case.
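
The linear fit of Equation (7) and its R<sup>2</sup> can be computed by ordinary least squares, as in the minimal sketch below. The function name and the residual values are made up for illustration and do not reproduce the case study results.

```python
def fit_trend(t, cm):
    """Least-squares fit of cm = a0 + a1 * t; returns (a0, a1, r2)."""
    n = len(t)
    t_mean = sum(t) / n
    cm_mean = sum(cm) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (ci - cm_mean) for ti, ci in zip(t, cm))
    a1 = sxy / sxx
    a0 = cm_mean - a1 * t_mean
    ss_res = sum((ci - (a0 + a1 * ti)) ** 2 for ti, ci in zip(t, cm))
    ss_tot = sum((ci - cm_mean) ** 2 for ci in cm)
    return a0, a1, 1.0 - ss_res / ss_tot

# Made-up residuals: a weak upward drift hidden in larger scatter.
t = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
cm = [-1.0, 1.0, -0.8, 1.2, -0.6, 1.4]
a0, a1, r2 = fit_trend(t, cm)
```

In this toy series the slope is positive but R<sup>2</sup> is low, so the scatter dominates and the trend alone would not justify conclusions about irreversible effects.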

**Figure 20.** Residuals along the training and the predicted sets.

#### *5.5. Step 5: Comparison to Other Models*

The interpretation of the structural behaviour through different models is fundamental to support an informed decision. This step was performed along with the four steps before, verifying that the presented ML models are suitable for the interpretation of the horizontal displacement analysed in the case study.

#### **6. Conclusions and Final Remarks**

In the day-to-day safety control of dams, data-based models are developed to analyse and predict structural dam behaviour (such as displacements in a concrete dam), considering the effects of the main loads (such as the hydrostatic, temperature, and time effects).

The use of data-based models, namely based on *HST* and *HTT* approaches, using machine learning techniques, has had an interesting evolution in dam engineering. Worldwide, the increase in the day-to-day use of ML techniques is expected, mainly in large hydroelectric companies and entities responsible for dam safety control activities. In order to ensure that models are adequate, validation and verification of the models by specialists with expertise in dam engineering is necessary. In this work, the authors present several recommendations, with scientific and technical frameworks, to validate and verify ML models.

Based on the specificity of the dam safety control activities, five techniques to perform model validation and verification were proposed and applied to the following ML models: multiple linear regression, support vector machine, multilayer perceptron neural network, random forest, and boosted regression trees. These five techniques were: historical data validation, sensitivity analysis, predictive validation, time evolution of the residuals, and comparison to other models. All the models presented proved suitable for the analysis and interpretation of the horizontal displacements presented in the case study. The proposed validation and verification of the ML models is based on five techniques that, to the authors' knowledge, had not previously been applied together. Sensitivity analysis is performed on all ML models, which is crucial for improving confidence in their day-to-day use.

Performing model validation and verification is part of the model development process. Our reflections and contributions are based on the article presented by Sargent [22] and on the authors' experience acquired along years of activity in the field of dam safety control. They can be summarised as follows:


ML methods have proven to be an interesting and suitable tool for developing data-based models, and they become increasingly relevant as the amount of information grows with the measurement history. Like any tool, they have to be used by specialists with a broad knowledge of how they work in order to obtain data-based models that satisfy the needs of the dam safety control activities. Within this continuous process, the validation and verification of the ML models adopted are key to having suitable models for reliable analysis and interpretation of the dam behaviour.

The proposed methodology for validation and verification of ML models can be applied to the prediction and analysis of other dam typologies and physical quantities, such as those related to the dynamic behaviour (e.g., frequencies modes) and hydro-mechanical phenomena (uplift pressure and seepage). This can increase the credibility of ML models in the dam engineering community and thus foster their widespread implementation.

**Author Contributions:** Conceptualization, J.M. and F.S.; methodology, J.M. and F.S.; software, J.B., A.A., J.M. and F.S.; validation, J.B. and A.A.; formal analysis, J.M., F.S., J.B. and A.A.; investigation, J.M., F.S., J.B. and A.A.; data curation, J.B. and A.A.; writing—original draft preparation, J.M. and F.S.; writing—review and editing, F.S., J.B. and A.A.; visualization, J.B. and A.A.; supervision, J.M. and F.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** The work of the second author was partially funded by the Spanish Ministry of Science, Innovation and Universities through the Project TRISTAN (RTI2018-094785-B-I00), by the Spanish Ministry of Economy and Competitiveness through the "Severo Ochoa Programme for Centres of Excellence in R & D" (CEX2018-000797-S), and by the Generalitat de Catalunya through the CERCA Program.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Restrictions apply to the availability of these data. Data were obtained from EDP-Energias de Portugal and are available from the authors with the permission of EDP-Energias de Portugal.

**Acknowledgments:** The authors acknowledge the company EDP-Energias de Portugal, which provided the data for the procedures addressed in this paper, and LNEC through its research program RESTATE (0403/112/20970). The authors would like to thank the anonymous referees for their suggestions and comments.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Rikard Hellgren \* , Jonas Enzell , Anders Ansell , Erik Nordström and Richard Malm**

Department of Civil and Architectural Engineering, KTH Royal Institute of Technology, 10044 Stockholm, Sweden; jonas.enzell@byv.kth.se (J.E.); anders.ansell@byv.kth.se (A.A.); enords@kth.se (E.N.); mmd@kth.se (R.M.) **\*** Correspondence: rhellg@kth.se

**Abstract:** In the assessment of concrete dams in cold climate, it is common that the theoretical stability becomes insufficient for load cases that include ice loads. However, the magnitude and return period of these ice loads have a high degree of uncertainty. This study estimates the magnitude of ice loads on eight concrete dam monoliths using measurements of their displacement from 29 winters. In the displacement signals, events are identified and assumed to be caused solely by ice loads. The observed displacement during an event is interpreted as an ice load using a load–displacement relationship derived from FE simulations of each dam. These simulations show that ice loads of the magnitudes given in design guidelines and recorded in previous measurements would significantly affect the structural response of the studied dams. However, only small traces of ice loads can be found in the observed responses of the studied dams. The estimated ice loads are significantly lower than the ice loads recorded in traditional ice load measurements. These results indicate that the average magnitude of ice load on an entire monolith is significantly lower than the measured local pressures. This would imply that ice loads may be a smaller concern regarding dam safety than previously believed.

**Keywords:** ice loads; concrete dams; back-calculation; dam safety; monitoring

#### **1. Introduction**

The ability to store water for electricity production, agriculture, or consumption is essential for modern societies. A fundamental prerequisite for water storage is safe dams. A failure of a dam may result in catastrophic consequences. Further, the construction and rehabilitation of dams have a high economic and environmental cost. These costs can be reduced if a dam's operation without measures can be prolonged. To do so, with a maintained high level of safety, adequate assessments are required. Therefore, refined analyses and increased knowledge about the loads that act on dams facilitate prolonging their life span, resulting in substantial economic and environmental benefits.

Dams are mainly designed to withstand the loads caused by the impounded water. In addition to the pressure from the water, dams in cold regions may be exposed to loads from an ice sheet. This ice load is caused by the restrained expansion or movement of the ice and may constitute a significant fraction of the total horizontal loads. Current guidelines for ice load on dams are typically based on the geographical location of the dam but do not consider the local conditions [1–5]. In these guidelines, the design ice load varies between 50 kN/m and 250 kN/m. Such magnitudes often result in theoretically insufficient stability for load cases that include ice loads in assessments of concrete dams in cold climates. However, the magnitude and return period of the ice loads are today among the most considerable uncertainties when evaluating dams in cold regions [6]. Furthermore, there is a discrepancy between the number of dams where ice load is a theoretical problem and the fact that few dam safety incidents have been reported where large ice loads have been identified as a cause.

**Citation:** Hellgren, R.; Enzell, J.; Ansell, A.; Nordström, E.; Malm, R. Estimating the Ice Loads on Concrete Dams Based on Their Structural Response. *Water* **2022**, *14*, 597. https://doi.org/10.3390/w14040597

Academic Editors: M. Amin Hariri-Ardebili, Fernando Salazar, Farhad Pourkamali-Anaraki, Guido Mazzà and Juan Mata

Received: 21 December 2021 Accepted: 8 February 2022 Published: 16 February 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

There are four categories of methods for measuring the ice load on fixed structures: interfacial measurements, methods based on Newton's second law, internal ice stress measurements, and structural response monitoring/hindcast calculations [7]. A recent systematic literature review of ice load measurements on dams compiled 123 unique recordings of seasonal maximum ice loads [8]. An overwhelming majority of these recordings were performed with local stress sensors or interfacial stress sensors that are considerably smaller than the dam–ice interaction area of interest. This difference leads to extrapolation and uncertainty regarding how well the measured local stresses represent the global structural load. One way to overcome this issue is to use the fourth measurement method, structural response monitoring (SRM). With this method, the ice loads are back-calculated from the measured response of the structure. SRM is the only ice load measurement method that determines the global load [7]. The method is also relatively cheap to implement and use, only requires work performed on land above water, and the measurements can be used as a part of the dam owner's continuous safety monitoring. Despite these advantages, only a few cases are reported where this method has been used for dams [9–12].

This study addresses this issue by back-calculating the ice load from observed displacement from dam safety monitoring of eight dam monoliths from five different dams. This is the first study to quantify the magnitude of ice loads on concrete dams based on their structural response. The back-calculated ice loads also add empirical data on the magnitude of global ice loads.

Sections 2–4 describe the methods and materials of this study. The two back-calculation approaches used to separate the displacement caused by ice loads from those caused by other loads are described in Section 2, and Section 3 describes the studied dams. Section 4 presents the analysis methods and their application. These methods include finite element simulations of the ice–load–displacement relationship and the transient behavior of the dams, data-based models to predict the behavior of the dams and calculations of ice loads. The research findings are presented and discussed in Section 5, focusing on three key themes: the expected structural response of the dams from ice loads, the accuracy of the applied methods and derived ice loads, and their practical implications. The conclusions from the research are summarized in Section 6. Table 1 explains the abbreviations that are used in figures and tables throughout the paper.


**Table 1.** Abbreviations used in the paper.

#### **2. Back-Calculation of Ice Load**

In the structural response monitoring method, the size of the ice load is back-calculated from the measured response of the structure. The method requires linearity between the ice load and the response [7]. Provided that the behavior of the dam of interest is already monitored, back-calculation of the ice load is performed in three steps:


Assuming that the relationship between the ice load *I* and the structural response *u* is linear, the relationship between a change in load and response can be written as

$$
\Delta I = K_I \Delta u \tag{1}
$$

where $K_I$ is the structural stiffness with respect to the ice load. If the dam response to an ice load is established, for example, from simultaneous measurements of ice load and dam behavior, the stiffness can be calculated directly by rearranging Equation (1). If not, the stiffness can be estimated from a model,

$$
\hat{K}_I = \frac{\Delta \tilde{I}}{\Delta \tilde{u}(\tilde{I})} \tag{2}
$$

Here, $\Delta \tilde{I}$ is the model ice load, $\Delta \tilde{u}(\tilde{I})$ is the modeled response, and $\hat{K}_I$ is the estimated stiffness with respect to the ice load.

An observed change in the structural response, $\Delta \overrightarrow{u}'$, can be divided into three parts,

$$
\Delta \overrightarrow{u}' = \Delta u(I) + \Delta u(I^c) + \varepsilon \tag{3}
$$

where $\Delta u(I)$ is the change caused by the ice load, $\Delta u(I^c)$ is the change caused by all other loads on the dam, and $\varepsilon$ is the observation error. Consequently, there are two main methods to estimate the size of the ice load from the measured signal: either by using data where $\Delta u(I^c) \approx 0$, so that Equation (3) reduces to

$$
\Delta \hat{u}(I) = \Delta \overrightarrow{u}' + \varepsilon \tag{4}
$$

where $\Delta \hat{u}(I)$ is an estimate of $\Delta u(I)$, or by estimating $\Delta u(I^c)$ so that

$$
\Delta \hat{u}(I) = \Delta \overrightarrow{u}' - \Delta \hat{u}(I^{c}) + \varepsilon \tag{5}
$$

From the estimated stiffness and the estimated response caused by the ice load, the magnitude of the ice load can be estimated as

$$
\Delta \hat{I} = \hat{K}_I \Delta \hat{u}(I). \tag{6}
$$

In this study, two methods have been used to quantify $\Delta \hat{u}(I)$: one event-based approach and one residual-based approach. These two methods are presented in the following sections.

#### *2.1. Event-Based Approach*

Ice load on dams occurs as events [11,13–19], and such an event is characterized by


These ice load events predominantly last from hours up to a day but can, in some cases, last several days [11]. Figure 1 shows examples of time histories containing three ice load events: one idealized time history from a design guideline [20] and one measured time history from Hellgren et al. [21]. This event-shaped time history is caused by a combination of the loading mechanisms and the mechanical behavior of the ice. The main load-causing mechanisms for ice loads on dams are restrained thermal expansion and water level fluctuations. These two mechanisms act over a limited duration because their maximum variation is finite. The variation can occur as a slow change over an extended period or as a rapid change over a short period, but neither the temperature nor the water level can increase or decrease continuously for more than a limited period. The mechanical behavior of freshwater ice is highly non-linear, with a high initial creep rate. This creep relaxes the stress caused by the mechanisms presented above, so the ice load continuously decreases during periods when no new load-generating event occurs.

With the first method, the ice load is back-calculated from events in the measured crest displacement. The ice load affects the dam behavior, and an ice load event causes the dam to deform in the downstream direction due to the increased load. If displacement events are identified in the measured signal, i.e., where the dam displaces from a local minimum to a peak, and if ∆*u*(*I<sup>c</sup>*) ≈ 0 during such an event, all displacements can be attributed to the ice load. The assumption that ∆*u*(*I<sup>c</sup>*) ≈ 0 during an ice load event is motivated by two observations: the duration of an event is too short for an ambient temperature change to affect the global behavior of the dam, and the water level variations related to ice loads are relatively minor for the dam.

**Figure 1.** Illustration of ice load events with an idealized and a real case. The idealized case is after ISO 19906 [20], and the real case is from Hellgren et al. [21].

A displacement event can be identified in the dam monitoring signal as the difference between a local minimum and the subsequent maximum. By doing this for the entire signal, *N* events can be identified,

$$
\Delta \overrightarrow{\mathbf{u}}' = \langle \Delta \overrightarrow{u}'_1, \ldots, \Delta \overrightarrow{u}'_N \rangle. \tag{7}
$$

After these events are identified, the maximum ice loads are estimated as follows:

$$
\Delta \hat{\mathbf{I}} = \hat{K}_I \Delta \overrightarrow{\mathbf{u}}'. \tag{8}
$$

The total ice load, **I<sub>T</sub>**, is the sum of a long-term ice load, **I<sub>L</sub>**, and an event ice load, ∆**I**, as

$$\mathbf{I}\_{\mathbf{T}} = \mathbf{I}\_{\mathbf{L}} + \boldsymbol{\Delta}\mathbf{I} \tag{9}$$

The long-term load describes the ice load at the start of the event and is a pressure built up in the ice over the winter. Comfort et al. [11] suggest that the long-term ice load *I<sub>L</sub>* is a function of the ice thickness, *h<sub>i</sub>*, and the ratio between the water level amplitude, *a*, and the ice thickness,

$$I\_L = 37(h\_i - 0.25) + \frac{1.47}{a/h\_i} \tag{10}$$

where *a* is the average water level amplitude over two days. Equation (10) provides the ice load in kN/m and is only valid for an ice thickness greater than 0.25 m and a ratio between *a* and *h<sub>i</sub>* greater than 0.08.
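As an illustration, Equations (9) and (10) can be sketched in a few lines of Python. The function names are hypothetical, and the amplitude is assumed to be given in metres so that the ratio to the ice thickness is dimensionless. The validity limits are those stated for Equation (10).

```python
def long_term_ice_load(ice_thickness_m: float, amplitude_m: float) -> float:
    """Long-term ice load in kN/m according to Equation (10).

    ice_thickness_m: ice thickness h_i in metres.
    amplitude_m: average water level amplitude a over two days
        (assumed here to be in metres).
    """
    if ice_thickness_m <= 0.25:
        raise ValueError("Equation (10) requires an ice thickness above 0.25 m")
    ratio = amplitude_m / ice_thickness_m
    if ratio <= 0.08:
        raise ValueError("Equation (10) requires a/h_i above 0.08")
    return 37.0 * (ice_thickness_m - 0.25) + 1.47 / ratio


def total_ice_load(long_term_kn_m: float, event_kn_m: float) -> float:
    """Total ice load as the sum of long-term and event loads, Equation (9)."""
    return long_term_kn_m + event_kn_m


# Illustrative input values only, not data from the studied dams.
example_long_term = long_term_ice_load(ice_thickness_m=0.5, amplitude_m=0.1)
example_total = total_ice_load(example_long_term, event_kn_m=50.0)
```

With these illustrative inputs, the two terms of Equation (10) are 37 × 0.25 = 9.25 kN/m and 1.47/0.2 = 7.35 kN/m.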

#### *2.2. Residual-Based Approach*

In the second method, the residual-based approach, a model is used to estimate the displacement of the dam caused by variation in all factors other than the ice load, **u**(*I<sup>c</sup>*). The model has the form

$$\mathbf{u}(I^{\mathfrak{c}}) = f(\mathbf{H}, \mathbf{T}, \mathbf{t}). \tag{11}$$

Here, **H** is the influence of hydrostatic pressure, **T** is the variation caused by changes in the ambient temperatures, and **t** is the irreversible changes that may occur over time. These three phenomena are the main causes of variation affecting the dam's global behavior. After the displacement caused by the complementary loads to the ice load has been estimated, the signal from the dam monitoring can be adjusted by removing these effects.

$$\mathbf{u}\_R = \overrightarrow{\mathbf{u}}' - f(\mathbf{H}, \mathbf{T}, \mathbf{t}) \tag{12}$$

For a model that perfectly describes the influence of *I<sup>c</sup>* on the dam, **u**<sub>*R*</sub> contains only the effect of the ice load and the measurement errors. Consequently, the total ice load is estimated as

$$
\hat{\mathbf{I}} = \hat{K}_I \Delta \mathbf{u}_R. \tag{13}
$$

Figure 2 shows the difference between the two methods used to identify displacements caused by ice loads for a fictitious displacement time series.

**Figure 2.** Illustration of the two methods used to identify displacements caused by ice loads in the observed displacements.

#### **3. Studied Dams**

This study was performed as a case study where the ice load was back-calculated using measurement data from four concrete buttress dams: Bålforsen (BFN), Storfinnforsen (SFF), Ramsele (RSE), and Rätan (RTN), and one arch dam, Krokströmmen (KRN). From the four buttress dams, data from seven independent monoliths were included in the analysis, four from Storfinnforsen and one from each of the other dams. All dams are located in Sweden.

#### *3.1. Dams*

The five studied dams are all run-of-the-river plants, with small variations between the maximum and minimum allowable water levels in the reservoir (<0.5 m). All of these dams also have an insulation wall installed on the downstream side. These insulation walls reduce the displacements caused by seasonal temperature variations and thereby limit the risk of propagation of thermal cracks. For several of the buttress monoliths, the insulation wall was installed after the observation of through-cracks. The crack pattern for each monolith includes several of the four crack types typically observed on buttress dams in Sweden: inclined cracks from the front plate that have propagated in the buttress toward the foundation, inclined cracks that have propagated from the inspection passage toward the front plate, and vertical cracks originating from the foundation, see [22,23].

The combination of small changes in water level and concrete temperature means that the dams are exposed to minimal external load changes other than the ice load. Therefore, these dam monoliths are suitable for back-calculations of the ice load. Table 2 provides a summary of the included monoliths. Below, a short presentation of each studied dam is given.

#### 3.1.1. Bålforsen

Bålforsen hydropower dam (BFN) is located in the river Umeälven in the northern region of Sweden and was constructed in 1958. In this study, data from concrete buttress monolith number 18 were used. The properties of the monolith are summarized in Table 2 and a sketch of the section and the locations of sensors are shown in Figure 3b. The data included from the dam are


#### 3.1.2. Rätan

Rätan hydropower dam (RTN), in the river Ljungan, is a 31 m high concrete buttress dam constructed in 1968 and located 20 km north of the geographic midpoint of Sweden, in the northern ice load region. At Rätan, ice load measurements were performed using a load panel from 2015 to 2021, see [19,21]. Monolith 15, the monolith where the load panel is attached, is included in this study. The dimensions of the monolith, its prominent cracks, and the position of the sensors are shown in Figure 3a. The monolith is equipped with a hanging pendulum, which unfortunately is non-functioning. Instead, data from measurements of local displacements were used. The included data are


#### 3.1.3. Krokströmmen

Krokströmmen hydropower dam (KRN) is located in the river Ljusnan, 30 km south of the geographical midpoint of Sweden and in the northern ice load region. The concrete dam is an arch dam that was constructed in 1952. The properties of the dam are presented in Table 2, and the section of monolith 10 is shown in Figure 3h. The included data are


**Figure 3.** Dam sections for the monoliths used in the case study. The light gray lines show a sketch of the location of the cracks included in the simulation, and the dark gray shows the insulation walls. (**a**) RTN, (**b**) BFN, (**c**) SM03, (**d**) SM42, (**e**) SM44, (**f**) SM46, (**g**) RSE, (**h**) KRN. MaWL: maximum retention water level. The abbreviations for the dam names are presented in Table 2.

Data for the external variables are available with hourly frequency from December 2016. However, the logging of the pendulum was non-functioning from November 2017. Therefore, only approximately one year of crest displacement recordings is available.

#### 3.1.4. Ramsele and Storfinnforsen

Ramsele (RSE) and Storfinnforsen (SFF) are two concrete buttress dams built in the 1950s, located 10 km apart in the Faxälven river in the northern ice load region of Sweden. The Storfinnforsen concrete dam consists of 81 independent concrete buttress monoliths, of which four are equipped with hanging pendulums. These monoliths are Monolith 03 (Figure 3c), Monolith 42 (Figure 3d), Monolith 43 (Figure 3e), and Monolith 46 (Figure 3f), referred to as M03, M42, M43, and M46, respectively. The concrete dam at Ramsele consists of 49 independent monoliths, where the tallest monoliths are slightly higher than 40 m. From Ramsele dam, data from Monolith 23 (Figure 3g) were included in this study.

Both of these dams have recently undergone extensive renovation and rehabilitation measures, which include a new monitoring program and the installation of post-tensioned rock-anchored tendons through the buttress wall to increase stability. New sensors have been continuously added after the start of the new program in 2019. Therefore, the availability of data from secondary measurements, such as indoor temperatures and water temperatures, varies but is overall sparse. The data included from the dams are


**Table 2.** Overview of basic characteristics, the data and the performed analysis for each dam.


X: Yes; -: No; \*: Year/Month; †: Not available, data from Rätan.

#### **4. Analysis Methods**

The eight monoliths were studied in two types of analyses: a pre-study that investigates the influence of previously measured ice loads on the transient behavior of the dams under normal conditions, and an estimation of the ice loads these dams have been subjected to from their measured structural response. The following analyses were performed:

- 1 Pre-study
	- I Simulation of transient behavior without ice load
	- II Simulation of transient behavior with ice load
- 2 Estimation of ice loads
	- I Simulation of ice load–displacement relationship
	- II Back-calculation
		- i Event identification
		- ii Residual
			- A. Hydrostatic, Seasonal, Time (HST)
			- B. Hydrostatic, Thermal, Time (HTT)

Table 2 presents the performed analysis for each dam. A more detailed presentation of each analysis is given in the following sections.

#### *4.1. Finite Element Analyses*

Three different FE-analyses were performed: one simulation of the transient behavior of the dam without ice load (1-I); one simulation of the transient behavior with applied ice load (1-II); and one to estimate the stiffness, i.e., the ice load–displacement relationship (2-I).

All numerical analyses were performed with Abaqus [24], version 2021, using the standard implicit solver. The models used for the simulations include the dam and part of the surrounding rock. All models are reused from previous studies, and a more thorough description of the FE models is given in the respective studies; see Table 3 for sources. The dams were modeled in 3D with 8-node linear brick elements with reduced integration and hourglass control (DC3D8 and C3D8R in Abaqus) or 6-node linear triangular prism elements (DC3D6 and C3D6 in Abaqus), see Table 3. A maximum element size of 0.3 m was used for the dams. The elements of the rock match those of the dams at the interface surface, with increasing size towards the outer edge of the models. In the mechanical model, the boundary conditions were applied to the rock by prohibiting displacements perpendicular to each side at all outer boundaries of the rock, except the top surface. The interaction properties for the interface between dam and foundation were chosen according to previous studies, and a linear elastic constitutive model was used for all materials, with properties as presented in Table 3 for each dam.

All simulations were performed in three steps. In the first step, the gravity load was applied to the dam. In the second step, the hydrostatic pressure corresponding to the maximum water level was applied on the upstream part of the rock and dam. The third step differed between the three analyses. In analyses 1-I and 1-II, the transient behavior of the dams was simulated without and with ice load, respectively. These simulations were performed in two domains, for temperature and mechanical equilibrium. In the temperature model, an adiabatic boundary condition was applied to all rock surfaces except the top surface. The top of the rock and the dam was divided into four categories of surfaces: outside, indoor, insulation, and water. On each surface, the corresponding measured temperature was applied using a Robin boundary condition. All temperatures are shown in Figure 4. Time steps of different lengths were used for periods with and without expected ice load. A two-week time step was used for the period May to December and a six-hour time step for the period where ice loads are expected (January to April). The resulting temperature field was used as input to the mechanical model with a one-way coupling. Thus, the resulting temperature distribution from the thermal domain was applied in the mechanical domain, and the strains and corresponding displacements from temperature variation were calculated using the same time steps.

In analysis 1-II, the transient simulation with ice load, the magnitude of the measured ice loads from [19,21] was applied as a uniform pressure on a one-meter-high surface under the maximum water level. These data were used for all included dams. The data contain measurements from six winters. However, the signal is only complete from ice formation to ice break-up for four of these winters: 2016/2017, 2018/2019, 2019/2020, and 2020/2021. Therefore, the recordings from these winters were used repeatedly in that order to cover the period from 2012 to 2021. Figure 4 shows the applied ice load, and Table 2 presents an overview of the performed analyses and the data used.

In analysis 2-I, the ice load was applied to the same one-meter-high surface as in analysis 1-II, in 30 kPa (30 kN/m) steps up to 300 kPa (300 kN/m). To mimic the measurement data, displacements corresponding to the measured quantities were extracted from the simulated deformations of the dam. From these displacements, a relation between the magnitude of the ice load and the measured deformation of the dam was calculated.
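The last step, deriving a stiffness from the simulated load steps, can be sketched as a least-squares fit. This is a minimal sketch under stated assumptions and not the implementation used in the study: the displacements below are placeholder values generated from an assumed stiffness, and the fit is forced through the origin on the assumption that displacements are expressed relative to the state without ice load.

```python
import numpy as np

# Load steps as in analysis 2-I: 30 kN/m increments up to 300 kN/m.
loads_kn_m = np.arange(30.0, 301.0, 30.0)

# Placeholder for displacements extracted from an FE model (mm).
# Here they are generated from an assumed stiffness, for illustration only.
assumed_stiffness = 160.0  # kN/m per mm
displacements_mm = loads_kn_m / assumed_stiffness


def estimate_stiffness(loads, displacements):
    """Least-squares slope of load against displacement through the origin."""
    loads = np.asarray(loads, dtype=float)
    displacements = np.asarray(displacements, dtype=float)
    return float(np.dot(displacements, loads) / np.dot(displacements, displacements))


k_hat = estimate_stiffness(loads_kn_m, displacements_mm)
```

For a perfectly linear response, the fitted slope equals the ratio in Equation (2) for any single load step.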


**Table 3.** Material properties used for simulations of the different dams.

#### *4.2. Data-Based Models*

Data-based models were used to back-calculate the ice load with the residual approach. These models were used to create adjusted measurement series, i.e., measurement series where the effects from all external loads (except the ice loads) were removed. There are many types of data-based models intended to predict the behavior of dams [31]. In general, data-based models have a better prediction accuracy than FE models but are less interpretable. For this reason, two data-based models with a distinct physical coupling were chosen. The models are the hydrostatic, thermal, time (HTT) model and the hydrostatic, seasonal, time (HST) model. In the following sections, a description of the implementation of these models is given.

#### 4.2.1. Hydrostatic, Seasonal, Time Model

The HST model was first introduced in [32] and has thereafter been used as a method for behavior analysis of several types of dams [33–40]. HST is a multilinear regression model, where the variation in behavior of a dam is assumed to be a function of three parts: *H*, the influence of hydrostatic pressure; *S*, seasonal effects; and *t*, time-dependent effects. The model thus includes a function for each phenomenon considered to affect the global behavior of the dam and is based on the hypothesis that these three variables are sufficient to explain the variation in behavior. Furthermore, the three variables are also assumed to be independent of each other.

The response of the dam can thus be written as

$$y\_{HST} = F(H) + F(S) + F(t). \tag{14}$$

In the literature, the hydrostatic pressure is predominantly modeled with a third- or fourth-degree polynomial. However, the dams in this case study all are exposed only to small water level variations. Therefore, a simple linear relation can be assumed [41],

$$F(H) = \beta\_0 + \beta\_1 h. \tag{15}$$

where *h* is the relative water level, related to the dam height *H<sub>dam</sub>* and the reservoir level *WL* according to

$$h = \frac{WL - BL}{H_{dam}}, \tag{16}$$

where *BL* is the bottom level. Figure 4 shows *h* for all dams.

In HST, the seasonal variation is modeled with the first terms in a periodic Fourier series, according to

$$F(S) = \beta\_5 \sin\left(\frac{2\pi t}{L}\right) + \beta\_6 \cos\left(\frac{2\pi t}{L}\right) + \beta\_7 \sin\left(\frac{4\pi t}{L}\right) + \beta\_8 \cos\left(\frac{4\pi t}{L}\right) \tag{17}$$

where *L* = 52.18 if the time variable *t* has the unit of weeks. This type of function assumes that the behavior of the dam follows a seasonal pattern consisting of a full-year period and a half-year period.

The last effect is the irreversible changes over time. In this study, a linear relationship is assumed for this effect,

$$F(t) = \beta_9 t. \tag{18}$$
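Taken together, Equations (14)–(18) define an ordinary least-squares problem. The sketch below builds the corresponding design matrix and fits it with NumPy. It is a minimal illustration and not the implementation used in the study: the function names are hypothetical, and the data are synthetic and noise-free, generated from assumed coefficients.

```python
import numpy as np

WEEKS_PER_YEAR = 52.18  # L in Equation (17) when t is given in weeks


def hst_design_matrix(h, t_weeks):
    """Columns: intercept, h, four seasonal Fourier terms, linear time trend."""
    h = np.asarray(h, dtype=float)
    t = np.asarray(t_weeks, dtype=float)
    w = 2.0 * np.pi * t / WEEKS_PER_YEAR
    return np.column_stack([
        np.ones_like(t), h,                                    # F(H), Eq. (15)
        np.sin(w), np.cos(w), np.sin(2.0 * w), np.cos(2.0 * w),  # F(S), Eq. (17)
        t,                                                     # F(t), Eq. (18)
    ])


def fit_hst(h, t_weeks, y):
    """Least-squares estimate of the HST coefficients."""
    X = hst_design_matrix(h, t_weeks)
    beta, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return beta


def predict_hst(beta, h, t_weeks):
    """Predicted response for given relative water level and time."""
    return hst_design_matrix(h, t_weeks) @ beta


# Synthetic example: two-week steps over roughly five years.
rng = np.random.default_rng(0)
t = np.arange(0.0, 260.0, 2.0)
h = 0.95 + 0.01 * rng.standard_normal(t.size)
true_beta = np.array([1.0, 2.0, 0.5, -1.5, 0.1, 0.2, 0.003])  # assumed values
y = hst_design_matrix(h, t) @ true_beta
beta_hat = fit_hst(h, t, y)
```

Because the synthetic response contains no noise, the fit recovers the assumed coefficients. With measured data, the residual of such a fit is the quantity used in the residual-based approach.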

#### 4.2.2. Hydrostatic, Thermal, Time Model

With the HTT model, the seasonal behavior is replaced by a function that considers the actual temperature,

$$y\_{HTT} = F(H) + F(T) + F(t) \tag{19}$$

In this study, the outside air temperature *T<sub>A</sub>*, the indoor air temperature *T<sub>I</sub>*, and the water temperature *T<sub>W</sub>* from each dam were used,

$$F(T) = \beta\_5 T\_A + \beta\_6 T\_I + \beta\_7 T\_W \tag{20}$$

The HTT model was not used for Storfinnforsen and Ramsele as the required input data are unavailable. For the arch dam at Krokströmmen, *T<sub>I</sub>* was omitted. For those dams with several thermometers at different water depths, the temperatures recorded by the topmost thermometers were included in the data-based analysis.

#### 4.2.3. Data Preparation

From the monitoring, two types of recordings were included in this study. The first type is global measurements of the crest displacement by hanging pendulums. In this study, positive changes correspond to a movement of the crest in the downstream direction, i.e., the direction in which the crest displaces under an increase in ice load. The included data all have a frequency of one sample per hour. The second type of included measurements is local displacements, recorded with crack width sensors. The data from such sensors were combined through a dimension reduction using a principal component analysis (PCA), and the first principal component (PC1) was used in the analyses. The PCA was performed on the simulated response to the ice load and results in a linear combination of the crack width signals that maximizes the response caused by ice loads in the recorded signals.
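A minimal sketch of this dimension reduction is given below. It is not the implementation used in the study. The sensor data are synthetic, and the choice to standardize the recorded signals with the mean and standard deviation of the simulated response is an assumption made for this illustration.

```python
import numpy as np


def first_component_weights(simulated):
    """Return PC1 weights, column means and column standard deviations.

    simulated: array of shape (n_load_steps, n_sensors) with the simulated
    sensor response to the ice load.
    """
    simulated = np.asarray(simulated, dtype=float)
    mean = simulated.mean(axis=0)
    std = simulated.std(axis=0)
    z = (simulated - mean) / std  # standardize each sensor
    # The right singular vectors of the standardized data are the principal axes.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    weights = vt[0]
    # The sign of a principal component is arbitrary; make the largest weight positive.
    if weights[np.argmax(np.abs(weights))] < 0:
        weights = -weights
    return weights, mean, std


def project_pc1(signals, weights, mean, std):
    """Project sensor signals onto PC1 using the same standardization."""
    signals = np.asarray(signals, dtype=float)
    return ((signals - mean) / std) @ weights


# Synthetic example: two sensors that respond linearly to the ice load.
load = np.linspace(0.0, 300.0, 11)
simulated = np.column_stack([0.001 * load, 0.004 * load])
weights, mean, std = first_component_weights(simulated)
pc1 = project_pc1(simulated, weights, mean, std)
```

In this synthetic case, both sensors are perfectly correlated after standardization, so they receive equal weights and PC1 increases monotonically with the load.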

For fitting of the data-based models, the temperatures and response signals were re-sampled to a time step of two weeks. In the re-sampling, the actual measured value was used, and no aggregation of the data between time steps was performed. The models were fitted on the complete time series of re-sampled data, thus without a division into training and test sets. After the fitting of each model, the residuals were calculated using the actual frequency of the recorded signals.

**Figure 4.** Input data for the analysis.

#### *4.3. Time History and Event Identification*

Ice load events were identified via a search for local maxima and minima in the recorded signals, i.e., the time history of the pendulum and PC1 data. A local maximum or minimum was defined as an extreme value during at least 6 h. After identifying a peak in the signal, an iteration was performed to find the nearest previous and the immediately following local minimum. These three points were used to define ice load events with time and magnitude for the start, peak, and end, respectively. This search was performed with the find\_peaks function from the Signal subpackage of the SciPy package [42].
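The search can be sketched as follows. This is a minimal sketch and not the code used in the study. In particular, mapping the 6 h criterion to the `distance` argument of `find_peaks` is an assumed reading for hourly data, the helper name `identify_events` is hypothetical, and the signal and stiffness are placeholder values.

```python
import numpy as np
from scipy.signal import find_peaks


def identify_events(signal, min_separation=6):
    """Return events as dicts with start, peak and end indices and the rise.

    signal: displacement samples at a fixed rate (hourly in this study).
    min_separation: minimum number of samples between neighbouring extremes
        (an assumed reading of the 6 h criterion for hourly data).
    """
    signal = np.asarray(signal, dtype=float)
    peaks, _ = find_peaks(signal, distance=min_separation)
    minima, _ = find_peaks(-signal, distance=min_separation)

    events = []
    for peak in peaks:
        before = minima[minima < peak]
        after = minima[minima > peak]
        if before.size == 0 or after.size == 0:
            continue  # a peak needs a local minimum on both sides
        start, end = int(before[-1]), int(after[0])
        events.append({
            "start": start,
            "peak": int(peak),
            "end": end,
            "rise": float(signal[peak] - signal[start]),
        })
    return events


# Synthetic hourly displacement in mm with a single event (illustrative only).
signal_mm = np.concatenate([
    np.linspace(0.3, 0.0, 10),
    np.linspace(0.0, 0.5, 13)[1:],
    np.linspace(0.5, 0.1, 13)[1:],
    np.linspace(0.1, 0.2, 6)[1:],
])
events = identify_events(signal_mm)

# Event ice load following Equation (8), with a placeholder stiffness.
stiffness_kn_m_per_mm = 160.0
event_loads = [stiffness_kn_m_per_mm * e["rise"] for e in events]
```

Note that `find_peaks` does not report extremes at the first or last sample, so events that are cut off by the start or end of the record are skipped in this sketch.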

#### *4.4. Calculation of Ice Load*

For all dams, the ice load was estimated using the event-based approach. From the identified events in crest displacements and PC1, all differences between the minimum value at the start of the event and the peak were assumed to be caused by the ice load. For the event-based approach, the ice thickness is a required input to Equation (10). In this study, the ice thickness was estimated using Stefan's equation, where the ice thickness, *h<sub>i</sub>*, is calculated from the accumulated freezing degree days (AFDD),

$$h\_i = \alpha \sqrt{\text{AFDD}\_i} \tag{21}$$

where *α* is a coefficient to account for local conditions. In this study, *α* = 2.7 was used, recommended for conditions that maximize the ice thickness, i.e., a windy lake with no snow cover [43].
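Equation (21) can be sketched as follows. Several details are assumptions made for this illustration and are not stated in the text above: the AFDD is accumulated from daily mean air temperatures in °C, it is not allowed to become negative during thaw periods, and with *α* = 2.7 the resulting thickness is taken to be in centimetres.

```python
import math


def accumulated_freezing_degree_days(daily_mean_temps_c, freezing_point_c=0.0):
    """Running AFDD in degree-days, floored at zero (assumed convention)."""
    afdd = 0.0
    series = []
    for temp in daily_mean_temps_c:
        afdd = max(0.0, afdd + (freezing_point_c - temp))
        series.append(afdd)
    return series


def stefan_ice_thickness(afdd, alpha=2.7):
    """Ice thickness from Equation (21); assumed to be in cm for alpha = 2.7."""
    return alpha * math.sqrt(afdd)


# Illustrative winter: 100 days with a constant daily mean of -10 degrees C.
afdd_series = accumulated_freezing_degree_days([-10.0] * 100)
thickness = stefan_ice_thickness(afdd_series[-1])
```

If the thickness is indeed in centimetres, it must be converted to metres before it is used in Equation (10).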

For the monoliths with a data period longer than two years the ice load was also calculated with the residual approach, see Table 2. In the residual approach, all positive residuals during the winter were interpreted as an ice load.
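The residual approach, including the two-week re-sampling used when fitting the data-based models, can be sketched as below. The function names are hypothetical, the winter mask is supplied by the caller, and all numbers are placeholders rather than data from the studied dams.

```python
import numpy as np


def two_week_subsample(values, samples_per_day=24):
    """Keep every measured value at a two-week spacing, without aggregation."""
    step = 14 * samples_per_day
    return np.asarray(values)[::step]


def residual_ice_load(observed_mm, predicted_mm, is_winter, stiffness_kn_m_per_mm):
    """Ice load from positive winter residuals, following Equations (12) and (13)."""
    observed = np.asarray(observed_mm, dtype=float)
    predicted = np.asarray(predicted_mm, dtype=float)
    winter = np.asarray(is_winter, dtype=bool)
    residual = observed - predicted  # adjusted signal, Equation (12)
    # Only positive residuals during the winter are interpreted as ice load.
    positive_winter = np.where(winter, np.clip(residual, 0.0, None), 0.0)
    return stiffness_kn_m_per_mm * positive_winter


# Placeholder values: four hourly samples, the last one outside the winter.
example_loads = residual_ice_load(
    observed_mm=[1.0, 1.2, 0.9, 1.5],
    predicted_mm=[1.0, 1.0, 1.0, 1.0],
    is_winter=[True, True, True, False],
    stiffness_kn_m_per_mm=100.0,
)
```

In this sketch, negative residuals and residuals outside the winter are set to zero rather than removed, so the output keeps the time base of the recorded signal.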

#### **5. Result and Discussion**

This section presents and discusses the results from the case studies and the back-calculation of ice loads. It starts with a comparison of the measured dam behavior and the predicted behavior from the three model types. After that, the estimated ice load–displacement relation is presented for each dam, before the identified and back-calculated ice load events and the residual ice loads are shown.

#### *5.1. Stiffness*

Figure 5 shows the calculated crest displacement as a function of ice load and presents the calculated stiffness. The relationship between increased ice load and crest displacement is linear for all dams. The magnitude of ice load required to displace the crest 1 mm ranges between 86 kN/m for the arch dam at Krokströmmen and 254 kN/m for Bålforsen. The monoliths at Storfinnforsen and Ramsele all have similar stiffness, where an ice load of approximately 160 kN/m is required to displace the dam crest 1 mm. This indicates that the load–displacement relationship is relatively constant for dams of this type. The slope of this relationship is also similar for all seven buttress monoliths, despite their different heights and shapes. If anything, the results show a small negative relation between the stiffness and the dam height.

The ice load–displacement relationship is also linear for the two PC1 signals. The slope of these relationships does not have an interpretable unit, as the data were standardized before the PCA. For Rätan, the weights in PC1 are 0.59 and 0.80 for Sensors 1 and 2, respectively. For Bålforsen, the other dam with local sensors, the weights in PC1 are −0.16, 0.82, 0.43, 0.08, and 0.35 for Sensors 1–5, respectively. Therefore, the sensor best positioned to capture effects from the ice load is Sensor 2, followed by Sensors 3 and 5. Sensors 2 and 3 are located orthogonal and parallel to the crack running vertically from the inspection gallery to the foundation, while Sensor 5 is located on the front plate.

**Figure 5.** Ice load–displacement relationship. The legend presents the slope for each line in kN/m per displacement unit. (**a**) Crest displacement, (**b**) PC1 of local displacements.

#### *5.2. Time History of Displacements*

Figure 6 shows the measured and modeled time histories from the three model types (HST, HTT, and FEM) for the crest displacement and PC1 of local displacements. Figure 6 shows that ice loads of the magnitude measured and applied in this project are large enough to be detected visually in the crest displacement. This difference is highlighted for Rätan dam in Figure 6a, where only the results from analyses 1-I and 1-II, the simulations without and with ice loads, are presented. However, this effect can also be seen for Bålforsen in Figure 6b. For Krokströmmen, the only arch dam, the ice load creates increased variation in the crest movements during the winter, but the magnitudes of these variations are relatively small and not necessarily distinguishable from the noise in the signal. One possible explanation for this difference is that the ice load does not act in the downstream direction along the whole arch dam. Thus, the dam displacements from increased ice loads are not fully represented by the downstream crest displacements measured by the pendulum. Another possible explanation is the lesser insulation and lack of heating, which cause larger natural crest variations for Krokströmmen compared to the other dams. Therefore, the influence from the ice load is relatively smaller and more difficult to distinguish from the influence of other loads. The difference between the displacements for simulations with and without ice loads can also be seen in the PC1 signals from Rätan in Figure 6g but is not as prominent as for the crest displacements.

#### *5.3. Accuracy*

Table 4 presents the root mean square error (RMSE) and the coefficient of determination (R<sup>2</sup>) for analyses 1-I, 2-II-ii-A, and 2-II-ii-B. The RMSE provides an absolute measure of fit in the units mm and kN/m, which is easier to interpret but only relevant for internal comparison. In contrast, R<sup>2</sup> is a relative measure of fit that can be used in comparisons between dams. All FE models capture the expected direction of the measured displacements well but, for some periods, predict an incorrect magnitude of the variations. The two comparisons between simulated and measured crest displacements both yield R<sup>2</sup> values of 0.76. However, the two models differ significantly in RMSE. For the FE model of Krokströmmen, the RMSE for the displacement is 3.44 mm, which, with a stiffness of 86 kN/m per mm, corresponds to an ice load of 296 kN/m. Errors of this magnitude make it impossible to accurately back-calculate the ice load using the residual-based approach. For the FE model of Bålforsen, the mean error is smaller but still considerable. The FE model of Rätan shows an adequate ability to capture the local response of the dam, while the FE model of Bålforsen shows an insufficient ability to capture the local response of the dam. For that reason, the accuracy of the derived stiffness is uncertain, and the PC1 from Bålforsen was therefore excluded from further analyses.

For both approaches, potential sources of error include the measuring accuracy of the dam monitoring equipment and the estimation of the load–displacement relationship. Good quality and accuracy of the measurements are essential for the quantification of the influence of different loads on the response of a dam. The installed pendulums have an accuracy of 0.01 mm [44], which translates to ice loads of approximately 0.8–2.5 kN/m based on the stiffness presented in Section 5.1. The error in the estimated ice load is directly proportional to the error in the estimated stiffness. For the crack width sensors, the accuracy is 0.005 mm and the resolution 0.00125 mm [45]. The observed variation between winter and summer for the crack width sensor with the largest variation is approximately 0.4 mm. Thus, the relative accuracy is lower for the crack width sensor than for the pendulum. This can be observed in the time history, where the crest displacement signal is smoother than the PC1 signal. The lack of resolution means that the PC1 signal varies between two discrete levels during some periods. This variation is sometimes incorrectly classified as events, which could have been avoided with higher resolution.

In the residual approach, the assumption is that the models can be used to remove the influence from all loads except the ice load from the measured signal. The model with the best prediction accuracy in this study was HTT. This model was used in the residual approach to create an adjusted signal where effects from temperature, water level, and time were removed. The HST provides R<sup>2</sup> values in the range 0.7–0.9, while R<sup>2</sup> for the HTT is over 0.89 in all comparisons. Thus, the linear combination of temperatures, water level, and time that constitutes the HTT can explain about 90% of the variance in displacements. These results can be compared to the work in [40], where an HST model was used on the buttress dams at Ancipa (R<sup>2</sup> = 0.95–0.97), Sabbione (R<sup>2</sup> = 0.95–0.96), and Malga Bissina (R<sup>2</sup> = 0.88–0.91, 0.92, 0.92, 0.90), and to the results in [39], where the crest displacement of a buttress dam was predicted with HST (R<sup>2</sup> = 0.95) and HTT (R<sup>2</sup> = 0.96). Thus, the prediction accuracy for the models in this study is similar to that of previous studies. However, the results presented in Table 4 show that, despite the high accuracy of the data-based models, the RMSE expressed as ice load is relatively large. For the four predictions of crest displacements, the RMSE is between 27 and 57 kN/m. Furthermore, as shown in Figure 7, the errors are similarly distributed during both summer and winter, with only a slight tendency for the average errors to be greater during the winter. In addition, the most significant errors and ice loads are similarly frequent during summer and winter. Thus, the ice loads calculated with the residual approach were deemed unreliable. Therefore, only the ice loads from the event-based approach were included as the final results.


**Table 4.** The root mean squared error (RMSE) and R<sup>2</sup> for the three analyses for crest displacement and PC1.

† RMSE × *K̂*<sub>*I*</sub>.

#### *5.4. Time History of Ice Loads*

Figure 8 shows a time history of the total ice load calculated with the event-based approach for both the crest displacement and the first principal component (PC1) of the crack width sensors. The figure shows two main results: that the calculated ice loads are small and that the deviations in crest displacement during periods with expected ice loads are similar to those from the rest of the year. In general, the ice loads are low, with only a few occasions of ice loads in the vicinity of 100 kN/m for all dams other than Rätan combined. In comparison with the expected response of the dams shown in Figure 6, the ice loads and underlying displacements are small.

Figure 8 also shows that events occur during all months and that the most prominent occur during periods with no ice in the reservoir. Figure 7b shows histograms of the identified ice load events during summer and winter. For a dam exposed to significant ice loads, the expected result is that events will be both greater and more frequent during January to April, when ice loads are expected to occur. Such a trend is visible neither in the events nor in the residuals or the resulting ice loads. A t-test for the difference in the mean event magnitude between winter and summer shows that the upper limit of the 95% confidence interval for the difference ranges from −2.2 kN/m for SM03 to 4.2 kN/m for Bålforsen, where a positive value means a larger mean for the winter period. There are two implications of these results:


**Figure 6.** Measured and modeled time history for the eight included dams: crest displacements; (**a**) RTN, (**b**) BFN, (**c**) SM03, (**d**) SM42, (**e**) SM44, (**f**) SM03, (**g**) RSE, (**h**) KRN; and the first principal component of the crack width measurements; (**j**) BFN and (**i**) RTN.

The first implication is a problem for the accuracy of the method and the estimated loads. Several improvements could be made to the classification algorithms, such as filtering the signal based on frequency or applying post-peak criteria. However, this study applied a conservative approach to ensure that the ice loads were not underestimated. In the event-based approach, all displacements that occur during an event were assumed to be caused by ice loads. During a thermal ice load event, the air temperature increases and warms the downstream part of the dam. When the temperature in this downstream part increases, thermal expansion causes a crest displacement in the upstream direction. Such displacement is in the opposite direction to a displacement caused by ice loads and could thereby conceal the influence of this load. However, the high thermal mass of concrete results in slow thermal expansion. During an event caused by water level variation, the hydrostatic pressure will also vary with the changed water level. However, this difference is also negligible, as any significant water level change will break the ice sheet.

**Figure 7.** Distribution of events and residuals during the summer (June to September) and the ice load season (January to April). (**a**) Residuals. (**b**) Events. For visibility, 15 events from BFN-PC1 with magnitudes over 50 kN/m and 4 events from RTN-PC1 with magnitudes over 200 kN/m have been removed. None of the removed events occurred during a winter.

The presence of ice loads of similar magnitude and frequency during winter and summer implies that several of the events inferred as ice loads should be attributed to noise in the measured signal. Therefore, the magnitude of ice loads presented in this study is most likely overestimated. Furthermore, static equilibrium requires the dam to deform in response to the ice load. The absence of clear evidence of such responses is an indication that the dams have not been subjected to any major global ice loads.
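The winter versus summer comparison of event magnitudes can be illustrated with a short sketch. It assumes unequal variances and, as a simplification, uses a normal quantile in place of the Student t quantile, so it is only a large-sample approximation of the t-test mentioned in the text. The function name and the event magnitudes are hypothetical and are not data from this study.

```python
import math
from statistics import NormalDist, mean, variance

def mean_difference_upper_limit(winter, summer, confidence=0.95):
    """Difference in means (winter - summer) and the upper confidence limit.

    Assumption: normal approximation of the two-sided interval; a Student t
    quantile would give a slightly wider interval for small samples.
    """
    diff = mean(winter) - mean(summer)
    std_err = math.sqrt(variance(winter) / len(winter)
                        + variance(summer) / len(summer))
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return diff, diff + z * std_err

# Hypothetical event magnitudes in kN/m, for illustration only.
winter_events = [10.0, 12.0, 14.0, 16.0]
summer_events = [11.0, 13.0, 15.0, 17.0]
diff, upper = mean_difference_upper_limit(winter_events, summer_events)
print(f"difference {diff:.1f} kN/m, upper limit {upper:.2f} kN/m")
```

An upper limit of only a few kN/m, as reported above, means that even the most favorable reading of the data leaves little room for systematically larger events during the ice load season.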

#### *5.5. Annual Max of Ice Loads*

Table 5 presents the maximum total ice load from each winter (January–April) for all included monoliths. This table shows the results without considering the reservations discussed above. The results can be divided into three categories. The first category is the single winter from the arch dam at Krokströmmen, where the maximum magnitude was 83 kN/m. The second category is the 20 winters from the remaining buttress dams. For these dams, the data from the crest displacement measurements indicate that the maximum ice load event for a typical winter is between 50 and 100 kN/m, with two occurrences of ice loads over 100 kN/m.

**Figure 8.** Time history of the total ice load (event + long-term) identified and back-calculated from events in the crest displacements and the first principal component of the crack width sensors. (**a**) RTN, (**b**) BFN, (**c**) KRN, (**d**) SM03, (**e**) SM42, (**f**) SM44, (**g**) SM03, (**h**) RSE.

The third category is the eight winters of PC1 data from Rätan. During these winters, the estimated ice loads are between 67 and 205 kN/m. For seven of the eight winters, the maximum estimated ice load exceeds 100 kN/m. The results from Rätan Dam can be compared with ice loads measured with a load panel on the upstream face of the dam [19,21]. The ice loads measured with the load panel are presented in Figure 8. During the six seasons of measurements, the 1 m wide and 3 m high load panel recorded 73 ice load events with a magnitude greater than 75 kN/m and an overall maximum of 200 kN/m. During the eight winters in this study, which include the six winters with measurements, the maximum ice load estimated from the response of the dam is 205 kN/m, and 52 events greater than 75 kN/m were identified. Only two of the events occurred simultaneously, i.e., ice loads were registered both by the panel and in the dam's response. These two events occurred on 5 and 19 February 2020. On the first occasion, an ice load of 111 kN/m was measured with the panel, while the estimated total ice load from the dam's response was 77 kN/m. The corresponding loads for the second occasion are 99 and 78 kN/m.
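The search for simultaneous events in the two records can be sketched as a simple date match. This is only an illustration, assuming one event per day and a match on the calendar date; the matching rule actually used in the study is not stated here. The function name is hypothetical, the two February 2020 entries use the loads quoted above, and the remaining entries are invented to show non-matching events.

```python
from datetime import date

def coincident_events(panel, response):
    """Return (day, panel_load, response_load) for days present in both records."""
    common_days = sorted(panel.keys() & response.keys())
    return [(day, panel[day], response[day]) for day in common_days]

# Loads in kN/m, keyed by date.
panel_events = {
    date(2020, 2, 5): 111.0,   # from the text
    date(2020, 2, 19): 99.0,   # from the text
    date(2020, 1, 12): 80.0,   # hypothetical, panel only
}
response_events = {
    date(2020, 2, 5): 77.0,    # from the text
    date(2020, 2, 19): 78.0,   # from the text
    date(2020, 3, 3): 90.0,    # hypothetical, dam response only
}

matches = coincident_events(panel_events, response_events)
for day, panel_load, response_load in matches:
    ratio = response_load / panel_load
    print(f"{day}: panel {panel_load:.0f}, response {response_load:.0f}, ratio {ratio:.2f}")
```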

Traditionally, the ice load has been measured with a sensor in the ice or on the dam face. Previous measurement campaigns with such sensors have recorded ice loads with significantly larger magnitudes than the design loads and those estimated in this study, e.g., 780 kN/m from one season at the Beaumont dam [46], 600 kN/m from one winter with four stress cell panels at the La Gabelle dam [46], 370 kN/m from five winters at the Seven Sisters dam [11], 290 kN/m from four winters at the Eleven Mile Canyon dam [47], 270 kN/m from five winters with three to four panels at the Tradalsvik dam [48], 200 kN/m from one winter with eleven panels at the Barrett Chute dam [49], and, at the Rätan dam, 200 kN/m from six winters with the load panel and 720 kN/m from one winter with three stress cell panels [19].

**Table 5.** The annual maximum of total ice load, event ice load, long-term ice load, ice thickness, accumulated degree freeze days (AFFD), the minimum temperature, and the mean water level amplitude. Note that the extreme values of the different variables during a winter do not necessarily occur simultaneously.


\* From crest-displacement data, see Section 5.3. † Data not available for the full winter.

One possible explanation for the small ice loads estimated in this study is scale effects, i.e., that local ice loads on a small area can be significantly larger than the mean of the global ice loads on the whole structure. This phenomenon is considered in design standards for piers, off-shore structures, and ships [20,50,51], but not in design guidelines for dams [1–5]. Ice is a material whose behavior and strength are scale-dependent. A compilation of 2073 freshwater ice beam tests shows that the scale-dependent flexural strength in kPa is proportional to *V*<sup>−0.13</sup> [52], where *V* is the volume of the beam. For ice–structure interaction, the same general relation also applies, and the average ice load decreases as the area of interaction increases. If the volume is taken to scale linearly with the width, the relationship between volume and flexural strength for freshwater ice presented in [52] gives a flexural strength for an ice volume with a width of 8 m (a dam monolith) that is about 76% of that of a 1 m wide volume (the load panel), since 8<sup>−0.13</sup> ≈ 0.76. The results from this study indicate that the scale effect is also applicable to dam–ice interaction and that further studies are needed to investigate how the ice loads vary along the dam and between different scales.

#### *5.6. Practical Implications*

The Swedish guidelines for ice loads on bridge piers and dams are essentially the same recommendations as given in the first guidelines from 1931 [53]. The foreword to the latest revision of the recommendations for ice loads on bridges in Sweden states that these first ice load recommendations were created based on no actual knowledge but have gained empirical validation [54]. There are indications both that the ice loads in design guidelines are overestimated and that they are underestimated. The results from previous measurements and the theoretical models indicate that the current guidelines underestimate the magnitudes of ice loads. However, a common result in assessments of concrete dams in cold regions is insufficient stability for load cases that include ice loads. For this reason, several Swedish dams have been rebuilt or strengthened to ensure sufficient stability. Despite this, there is a general opinion among dam engineers that the current guidelines are too strict. The use of these design magnitudes worldwide in the design of dams for almost a century has resulted in few incidents and no failures reported as induced by ice loads.

For the design of concrete dams, the relevant load is the global ice load, i.e., the average load on the width of the structure. This ice load is not a load in the pure sense but rather a restraint force that arises from the restrained movement or expansion of the ice sheet. A dam subjected to ice loads will deform until a new equilibrium position is reached. Ice loads of the magnitude in current design guidelines constitute a significant portion of the total horizontal load on most dams. The magnitude of these ice loads, combined with the long lever arm from the foundation, theoretically causes a large structural response. This large expected influence of ice loads is also shown in the results from analyses 1 (I and II) and 2-II in this study.

The most probable interpretation of the results of this study is that no traces, or only very small traces, of ice load can be observed in the measured response of most dams. These results are limited to the eight studied dam monoliths. However, since the ice load was included as a design load for dams approximately one hundred years ago without any theoretical basis, ice loads have not caused any major incidents. Therefore, a main focus of future investigations should be to find full-scale empirical evidence of the existence of global ice loads of a relevant magnitude on dams. It is likely that such evidence can be found, but until then, it is recommended that studies are performed to investigate whether ice loads on concrete dams can pose a dam safety issue.

Such investigations should primarily focus on finding the influence of ice loads on the structural behaviour of concrete dams. Using measurements from the dam monitoring to estimate the magnitude of the ice load is a cost-effective method that has advantages from both a scientific and a dam safety perspective. From a scientific perspective, this type of measurement provides a cheap method that is readily accessible for many dams. The method also facilitates the use of data already collected, as long as it was sampled with a sufficiently high frequency. Therefore, the method presents an opportunity to rapidly expand the empirical data on the impact of ice loads on concrete dams. From a dam safety perspective, the installation of a global displacement sensor such as a hanging pendulum has several benefits. For dams where the dam owner wants to examine or monitor the magnitude of the ice load, such a method is therefore very advantageous.

#### **6. Conclusions**

This study estimates the magnitude of the ice load on an arch dam and seven monoliths from four concrete buttress dams, based on the measured structural response during 29 winters. The main results are the magnitudes of ice loads calculated using identified displacement events in the measured signals. With this event-based approach, the loads were estimated from identified displacement events in the measured signal and interpreted as ice loads using a displacement–ice load relationship derived from FE simulations. The results from the FE simulations show that ice loads of the magnitudes obtained from traditional ice load measurement sensors and given in the applicable design guideline should significantly affect the structural response of the studied dams and be detected by traditional dam monitoring. Nevertheless, only small traces of ice loads can be found in the observed response of the studied dams.

The annual maximum magnitudes of ice loads estimated in this study are significantly smaller than those recorded by ice load measurement sensors and those given in design guidelines. However, the results show that displacement events occur with similar frequencies and magnitudes during all months. Therefore, the conservative assumption used in this study, that ice loads cause all displacements during events, is likely to cause an overestimation of the ice loads. The event-based approach applied in this study thus provides a conservative estimate of the magnitude, and the true magnitude of the ice loads these dams have been subjected to is most likely even lower.

This study involves modeling the structural behavior of several concrete dams located in cold climates. Therefore, some auxiliary conclusions regarding the interpretation and modeling of the behavior of the dams in this study, adjacent to the primary research question for this study, are:


This study demonstrates the need to further investigate the relationship between stresses measured in the ice sheet, pressures measured at the dam–ice interaction face, and the dam's response. Further research should be undertaken to investigate the influence of ice loads on the structural behavior of concrete dams. The event-based method used in this study is fast and straightforward and, therefore, suitable for application to other dams. Such studies could rapidly increase the empirical knowledge regarding the magnitudes of ice loads on dams.

**Author Contributions:** Conceptualization, R.H. and R.M.; methodology, R.H.; software, R.H.; validation, R.H.; formal analysis, R.H.; investigation, R.H.; resources, R.M. and A.A.; data curation, R.H.; writing—original draft preparation, R.H.; writing—review and editing, R.M., J.E., A.A. and E.N.; visualization, R.H.; supervision, R.M., A.A. and E.N.; project administration, R.M.; funding acquisition, R.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** The research presented was carried out as a part of "Swedish Hydropower Centre—SVC". SVC has been established by the Swedish Energy Agency, Energiforsk and Svenska Kraftnät together with Luleå University of Technology, KTH Royal Institute of Technology, Chalmers University of Technology and Uppsala University (www.svc.nu accessed on 7 February 2022).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Seismic Safety Assessment of Arch Dams Using an ETA-Based Method with Control of Tensile and Compressive Damage**

**André Alegre 1,2,\* , Sérgio Oliveira <sup>2</sup> , Paulo Mendes <sup>1</sup> , Jorge Proença <sup>3</sup> , Rafael Ramos 2,3 and Ezequiel Carvalho <sup>4</sup>**


**Abstract:** The seismic safety assessment of large concrete dams remains a significant challenge in dam engineering, as it requires appropriate analysis methods, modern performance criteria, and advanced numerical models to simulate the dam seismic behavior. This paper presents a method for seismic safety assessment of arch dams based on Endurance Time Analysis (ETA), using tensile and compressive damage results from a robust formulation for seismic analysis considering joint opening/sliding and concrete non-linear behavior (finite element program *DamDySSA*, under development in LNEC). The seismic performance is evaluated by controlling the evolution of the damage state of the dam, according to predefined performance criteria, to estimate acceleration endurance limits for tensile and compressive damage. These acceleration limits are compared, respectively, with the peak ground accelerations prescribed for the Operating Basis Earthquake (OBE) and Safety Evaluation Earthquake (SEE), aiming to evaluate the dam seismic performance relative to both earthquake levels efficiently, using a single intensifying acceleration time history. The ETA-based method is applied to the cases of Cabril Dam (132 m-high) and Cahora Bassa Dam (170 m-high), confirming its usefulness for future seismic safety studies, while the potential of *DamDySSA* for non-linear seismic analysis of arch dams is highlighted.

**Keywords:** arch dams; seismic safety; endurance time analysis; non-linear seismic analysis; concrete damage model; tensile and compressive damage

#### **1. Introduction**

#### *1.1. Framework and Motivation*

Large concrete dams are civil engineering structures with significant social, environmental, and economic impact. In fact, dams play a key role in the proper management of freshwater resources, namely for water supply, flood control, soil irrigation, and energy production, and they have become vital to populations and societies [1,2], not only due to the global increase in water demand, but also because of climate change [3]. Most dams are structures of high potential risk, since incidents or accidents may result in significant losses for populations and the environment [4]. For these reasons, dam engineers must ensure the best operational conditions and the structural safety of dams in normal service conditions and during exceptional events, under both static and dynamic loads.

In this context, emphasis should be given to the fact that many of the major concrete dams in operation or under construction are located in regions of high seismicity [5], and strong earthquakes can cause unacceptable joint openings or significant concrete damage that may require service interruption or even compromise structural integrity, among other incidents [6–9]; for example, in China, one of the regions with the highest seismic activity in the world, sophisticated experimental and numerical studies were carried out to support the design of new ultra-high concrete dams [10–13]. On the other hand, most of the older dams were built many decades ago, and thus designed using unreliable seismic analysis methods and outdated performance criteria. As such, the need to reassess the seismic safety of older dams based on modern practices has been identified; for example, in Switzerland, the seismic safety reassessment of all 208 large dams according to modern standards was required and carried out based on current approaches [14].

**Citation:** Alegre, A.; Oliveira, S.; Mendes, P.; Proença, J.; Ramos, R.; Carvalho, E. Seismic Safety Assessment of Arch Dams Using an ETA-Based Method with Control of Tensile and Compressive Damage. *Water* **2022**, *14*, 3835. https://doi.org/10.3390/w14233835

Academic Editor: M. Amin Hariri-Ardebili

Received: 21 October 2022 Accepted: 21 November 2022 Published: 25 November 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Therefore, appropriate methodologies and models should be developed to analyze the dynamic response of dams in normal operating conditions, aiming to control structural integrity over time and to evaluate the structural safety under seismic loads. The permanent structural health monitoring of dams can be conducted based on ambient/operational vibration analysis methods, to detect modal parameter variations that can be correlated with loss of stiffness, while the response during seismic events can be monitored, e.g., by controlling quantities of interest or by comparing the measured acceleration time histories with the response predicted using numerical models [15–17]. For seismic safety assessment of dams, the performance under strong earthquakes should be evaluated based on reliable methods of analysis and on modern performance criteria [8,9], using advanced models to simulate the dynamic response of dam–reservoir–foundation systems, with the possibility of modeling non-linear structural behavior, and using suitable seismic inputs, which are essentially models of earthquake ground motion [18,19].

#### *1.2. Objectives and Contributions*

Dam safety assessment, namely under seismic loads, has become a fundamental component in ensuring the overall safety of large concrete dams [9], with a view to meeting the increasingly demanding requirements in terms of structural safety and to responding to the main concerns of dam owners and entities responsible for dam safety control [20–22]. To contribute to the methodologies for seismic safety assessment of concrete dams, the main objective of this paper is to present a method based on Endurance Time Analysis (ETA), using tensile and compressive damage results from sophisticated non-linear seismic simulations with joint movements and concrete deterioration.

Although ETA-based procedures have already been used by other researchers for evaluating the dynamic capacity of dams under intensifying earthquakes [23–29], the innovation presented in this work resides in the adopted approach for evaluating the seismic performance of the dam: essentially, this is conducted by controlling the evolution of the damage state of the dam, considering suitable performance criteria, in order to estimate acceleration endurance limits associated with both tensile and compressive damage. Additionally, using a single intensifying acceleration time history, the proposed approach enables an efficient evaluation of the seismic performance of the dam in relation to the earthquake levels required in current guidelines [20–22], namely by comparing the endurance limits for tensile and compressive damage with the peak ground accelerations prescribed for the Operating Basis Earthquake (OBE) and the Safety Evaluation Earthquake (SEE) of the dam site.
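The comparison between endurance limits and prescribed earthquake levels can be sketched as follows. This is a minimal illustration, assuming that the peak acceleration of the intensifying record grows linearly with time; the function names and all numerical values are hypothetical and do not come from the case studies. Following the pairing described in this paper, the tensile damage limit is checked against the OBE and the compressive damage limit against the SEE.

```python
def endurance_limit(exceedance_time_s, target_time_s, target_pga_g):
    """Peak acceleration reached when a performance criterion is first exceeded.

    Assumption: the record intensifies linearly and reaches target_pga_g
    at target_time_s.
    """
    return target_pga_g * exceedance_time_s / target_time_s

def check_earthquake_levels(limit_tensile_g, limit_compressive_g, pga_obe_g, pga_see_g):
    """Compare the tensile limit with the OBE and the compressive limit with the SEE."""
    return {
        "OBE": limit_tensile_g >= pga_obe_g,
        "SEE": limit_compressive_g >= pga_see_g,
    }

# Hypothetical values: the record reaches 1.0 g at 10 s.
limit_tensile = endurance_limit(4.0, target_time_s=10.0, target_pga_g=1.0)
limit_compressive = endurance_limit(9.0, target_time_s=10.0, target_pga_g=1.0)
result = check_earthquake_levels(limit_tensile, limit_compressive,
                                 pga_obe_g=0.1, pga_see_g=0.3)
print(limit_tensile, limit_compressive, result)
```

Because both limits are read from the same simulation, one intensifying record is enough to assess the dam against both earthquake levels.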

Another innovation is the way the non-linear seismic response of arch dams is simulated, using the finite element program *DamDySSA*. This program, under development in LNEC for several years, includes a robust coupled formulation for non-linear seismic analysis of dam–reservoir–foundation systems, simultaneously considering (a) the structural effects due to the opening/sliding joint movements, using a non-linear joint model, and (b) the tensile and compressive damage in concrete, based on an isotropic constitutive damage model with two independent damage variables and softening.

The ETA-based method is applied to the cases of Cabril Dam (132 m-high) and Cahora Bassa Dam (170 m-high). Overall, high acceleration endurance limits associated with tensile damage and compressive damage were determined for both dams, demonstrating their good seismic performance.

#### **2. On the Seismic Safety of Concrete Dams**

Although significant developments have been achieved in this field, seismic capacity assessment must be carried out based on sensitive analyses of results and engineering judgement [28,30–32], with a view to meeting the requirements and the performance criteria defined in the guidelines of the International Commission on Large Dams (ICOLD) and in the regulations of each country. This section presents an overview of several topics of interest for the seismic safety of concrete dams.

#### *2.1. Models for Non-Linear Dynamic Analysis of Dam–Reservoir–Foundation Systems*

Large concrete dams are usually structures with unique and complex geometry, particularly in the case of arch dams, and they may have different types of discontinuities, including construction joints or concrete cracking. Furthermore, the dynamic behavior of dams is strongly influenced by the interaction with the reservoir and the foundation [33,34]. Therefore, for simulating the dynamic behavior of concrete dams it is essential to develop robust models of dam–reservoir–foundation systems (Figure 1), considering the specific features of the dam structure and multiple dynamic effects, such as dam–water interaction, pressure wave propagation in the reservoir domain, dam–foundation interaction, the behavior of the rock mass, and damping mechanisms, including viscous damping in the dam and radiation damping in the reservoir and foundation; these are factors that can have a significant influence on the overall seismic behavior of the dam [18,19].

To simulate the reservoir and dam–reservoir dynamic interaction, there is the classic added water mass model, using a displacement-based formulation for the dam and foundation and considering the reservoir mass effect based on the solution proposed by Westergaard [35]. Although simple and efficient, this model neglects water compressibility and hydrodynamic effects on curved and flexible dams [36,37], and the added mass effect is overestimated for arch dams [19]. Therefore, a better solution is to use a coupled model based on a finite element formulation for simulating the behavior of the solid (dam and foundation) and fluid (reservoir) domains [38]. In this case, it is common to consider a formulation in displacements for the solid, and in hydrodynamic pressures or in velocity potentials for the fluid, considering proper boundary conditions to simulate dam motion–water pressure coupling and the reservoir pressure wave propagation with radiation at the far-end boundary [38–40]. A coupled formulation in displacements and pressures was implemented in the finite element program *DamDySSA*.
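For reference, the classic added mass idea can be sketched with the widely used parabolic approximation of Westergaard's solution, in which the added mass per unit area of a rigid vertical upstream face at depth y below the water surface is (7/8)·ρ·√(H·y) for a reservoir of depth H and incompressible water. This is a minimal sketch of the simplified model only and is not the coupled formulation implemented in *DamDySSA*; the function names and the 100 m reservoir depth are hypothetical.

```python
import math

RHO_WATER = 1000.0  # water density in kg/m^3

def westergaard_added_mass(depth_m, reservoir_depth_m, rho=RHO_WATER):
    """Added mass per unit area (kg/m^2) at a given depth below the surface."""
    if not 0.0 <= depth_m <= reservoir_depth_m:
        raise ValueError("depth must lie between the surface and the reservoir bottom")
    return 7.0 / 8.0 * rho * math.sqrt(reservoir_depth_m * depth_m)

def total_added_mass(reservoir_depth_m, n_slices=1000, rho=RHO_WATER):
    """Added mass per unit width (kg/m), integrated with the midpoint rule."""
    dy = reservoir_depth_m / n_slices
    return sum(westergaard_added_mass((i + 0.5) * dy, reservoir_depth_m, rho) * dy
               for i in range(n_slices))

H = 100.0  # hypothetical reservoir depth in m
mass_at_base = westergaard_added_mass(H, H)
mass_total = total_added_mass(H)
print(f"added mass at the base: {mass_at_base:.0f} kg/m^2")
print(f"added mass per unit width: {mass_total:.3e} kg/m")
```

The numerical integral can be checked against the closed-form value (7/12)·ρ·H², which follows from integrating the square-root profile over the depth.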

Regarding the foundation behavior, the massless models hypothesize a deformable foundation block with a rigid boundary at the base, neglecting the wave propagation effects and radiation damping [41]. For massless models, the substructure method is usually employed to compute an elastic and massless substructure foundation block, considering an equivalent stiffness matrix condensed at the dam–rock interface [41,42]; the seismic input can be applied as uniform or spatially variable ground motion [43,44]. Alternatively, energy dissipating models consider the foundation mass and enable the simulation of wave propagation and radiation effects in the rock mass [34,45,46]. The seismic input can be obtained using equivalent force schemes, to generate the ground motion at the dam–foundation surface [47,48], or by performing deconvolution analysis and then using compression/shear waves propagating from the foundation base [40,45]. Neglecting the foundation inertia and damping can result in an overestimation of stresses in the dam body [45,49–51], if no additional damping source is considered. However, for substructure massless models, a damping matrix proportional to the foundation stiffness matrix can be introduced at the dam–rock interface, as considered in *DamDySSA*.

With regard to the seismic behavior of arch dams, under low intensity earthquakes, commonly measured on dams, low-amplitude movements are expected, and thus the numerical simulations can be carried out assuming linear-elastic behavior for concrete and considering that joints in the dam body remain closed [15,52–54]. On the other hand, under high intensity earthquakes, vibrations of greater amplitude and hence larger deformations may occur, resulting in the opening of the vertical contraction joints [55,56], and, at the same time, in high tensile and/or compressive stresses that might cause concrete cracking or crushing [57,58]. Therefore, for non-linear seismic analysis of arch dams, appropriate constitutive models should be used in order to simulate both the structural effects due to joint movements and the non-linear behavior of concrete up to failure under tension and compression [59–63]. In *DamDySSA*, a robust formulation is implemented for non-linear seismic analysis, considering a non-linear joint model, to simulate opening/closing and sliding movements, and an isotropic constitutive damage model with softening and two independent damage variables for tension and compression.
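The idea of two independent damage variables with softening can be illustrated with a one-dimensional sketch. It assumes a generic scalar damage law with linear softening and is not the constitutive model implemented in *DamDySSA*, whose details are not given here; the class name, the function name, and all material values are hypothetical.

```python
def linear_softening_damage(kappa, eps0, eps_u):
    """Damage in [0, 1] for the largest strain magnitude kappa reached so far.

    eps0 is the strain at peak stress and eps_u the strain at zero stress.
    """
    if kappa <= eps0:
        return 0.0
    if kappa >= eps_u:
        return 1.0
    return 1.0 - (eps0 / kappa) * (eps_u - kappa) / (eps_u - eps0)

class TwoVariableDamage1D:
    """1D material with separate damage histories for tension and compression."""

    def __init__(self, modulus, f_t, f_c, eps_ut, eps_uc):
        self.modulus = modulus
        self.eps0_t = f_t / modulus   # strain at tensile strength
        self.eps0_c = f_c / modulus   # strain magnitude at compressive strength
        self.eps_ut = eps_ut
        self.eps_uc = eps_uc
        self.kappa_t = 0.0            # largest tensile strain so far
        self.kappa_c = 0.0            # largest compressive strain magnitude so far

    def stress(self, strain):
        """Update the relevant damage history and return the stress."""
        if strain >= 0.0:
            self.kappa_t = max(self.kappa_t, strain)
            d = linear_softening_damage(self.kappa_t, self.eps0_t, self.eps_ut)
        else:
            self.kappa_c = max(self.kappa_c, -strain)
            d = linear_softening_damage(self.kappa_c, self.eps0_c, self.eps_uc)
        return (1.0 - d) * self.modulus * strain

# Hypothetical concrete: E = 30000 MPa, f_t = 3 MPa, f_c = 30 MPa.
model = TwoVariableDamage1D(30000.0, 3.0, 30.0, eps_ut=1.0e-3, eps_uc=1.0e-2)
peak_tension = model.stress(1.0e-4)    # at the tensile strength
softened = model.stress(5.5e-4)        # on the softening branch
compression = model.stress(-1.0e-4)    # unaffected by the tensile damage
print(peak_tension, softened, compression)
```

In this sketch, tensile damage accumulated during one part of the loading does not reduce the compressive stiffness, which is the practical meaning of keeping the two damage variables independent.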

#### *2.2. Ground Motion*

For non-linear seismic analysis, carried out using time domain procedures, the seismic input is defined by means of acceleration time histories. Preferably, the earthquake ground motion should be represented by real accelerograms, recorded for site conditions similar to those of the dam to be analyzed [64]. Nevertheless, currently available records of strong earthquakes are not enough to cover the range of possible conditions, meaning that artificial acceleration time histories are needed [20]. Ideally, these accelerograms should be obtained based on the design ground motion parameters from specific seismic hazard studies conducted for the site of the dam to be analyzed.

Suitable acceleration time histories for dynamic response analysis can be generated from response spectra [65], considering specific features of the horizontal and vertical components and spatial variation effects (Figure 2a). Several methods have been developed for generating realistic seismic acceleration time histories, including the classic stochastic stationary procedure implemented in the computer program SIMQKE [66], the so-called Stochastic Method developed by Boore [67], the stochastic fault rupture and seismic wave propagation model developed by Carvalho [68], and the method for generating nonuniform ground motion using transfer functions proposed by Alves [47]. A different approach that has been gaining popularity in dam engineering is to consider endurance time excitation functions (ETEFs), developed by Estekanchi [69] and later optimized [70,71], which enable the generation of artificial intensifying acceleration time histories (Figure 2b) for ETA-based procedures.

Generated acceleration time histories, especially from ETEFs, may be quite different from acceleration records of real earthquakes, as they are models of the seismic load. Even so, by applying appropriate methods of analysis, the use of generated accelerograms as the seismic inputs in advanced numerical models of dam–reservoir–foundation systems will lead to a safe seismic design and to an adequate seismic safety assessment, which is essentially the main goal [72].

**Figure 2.** Seismic acceleration time histories: (**a**) design spectrum compatible generated acceleration time history; (**b**) intensifying dynamic excitation produced based on an ETEF.

#### *2.3. Methods of Analysis*

The studies for seismic safety assessment of dams are usually conducted by performing multiple time history seismic analyses, using appropriate acceleration time histories, to simulate both common service scenarios and failure scenarios. The seismic response analyses can be carried out using one or multiple (with distinct frequency content) generated accelerograms, which are scaled to obtain seismic loads with various peak ground accelerations, corresponding to weaker or stronger ground motions. This methodology follows the principles of Incremental Dynamic Analysis [73], and it has allowed researchers to obtain valuable results and draw firm conclusions on the seismic capacity of large concrete dams; several application examples can be found in studies for gravity dams [61,74] and arch dams [75,76]. The obvious disadvantage is the need to conduct multiple calculations for different seismic excitation levels, resulting in longer calculation times to investigate different scenarios, particularly if non-linear analyses are required.
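The scaling step described above can be sketched as follows. This is only an illustration of amplitude scaling to a set of target peak ground accelerations; the function name `scale_to_pga`, the synthetic decaying-sine record, and the chosen PGA levels are assumptions made for the example and are not taken from the cited studies.

```python
import numpy as np

def scale_to_pga(accel, target_pga):
    """Scale an acceleration time history so its peak absolute value equals target_pga."""
    accel = np.asarray(accel, dtype=float)
    peak = np.max(np.abs(accel))
    if peak == 0.0:
        raise ValueError("record has zero peak acceleration")
    return accel * (target_pga / peak)

# Hypothetical record (units of g): a decaying sine standing in for an accelerogram
t = np.linspace(0.0, 10.0, 1001)
record = np.exp(-0.3 * t) * np.sin(2.0 * np.pi * 2.0 * t)

# Hypothetical excitation levels, one time history analysis per level
pga_levels = [0.1, 0.2, 0.4, 0.8]
scaled_records = {pga: scale_to_pga(record, pga) for pga in pga_levels}
```

Each entry of `scaled_records` would feed one separate time history analysis, which is why the computational cost grows with the number of excitation levels.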

Another approach that has been used more frequently over the past decade is to evaluate the seismic capacity of concrete dams based on ETA [23–29], which is essentially a seismic analysis pushover procedure for seismic performance assessment under a predesigned intensifying dynamic excitation. The aim of ETA is to subject the dam to dynamic vibrations from a low excitation level, where the structural response is within the linear domain, to medium excitation level, as structural non-linearities start to occur, and finally to a high excitation level, ultimately causing dam failure. Therefore, the evaluation of the seismic performance of the dam is performed in a single time history analysis, by controlling the response in multiple time steps along the process, considering one or several engineering demand parameters [24,29]. The ETA-based approach enables a good assessment of the seismic capacity of dams, and it is also highly efficient when compared to the traditional approach, as it significantly reduces computational demands. ETA procedures may also be of great use, e.g., for dynamic shape optimization [28].
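To illustrate the idea of a single intensifying excitation, the sketch below builds a sine wave with a linearly growing envelope and tracks the largest acceleration reached up to each instant. It is not an actual ETEF (those are obtained by optimization against target spectra); the signal shape, the 2 Hz frequency, and the 1.5 g at 15 s envelope are assumptions chosen only for illustration.

```python
import numpy as np

def intensifying_signal(duration, peak_at_end, dt=0.01, freq_hz=2.0):
    """Return time vector and a sine wave whose amplitude grows linearly with time."""
    t = np.arange(0.0, duration + dt / 2, dt)
    envelope = peak_at_end * t / duration
    return t, envelope * np.sin(2.0 * np.pi * freq_hz * t)

def running_peak(accel):
    """Largest absolute acceleration reached up to each time step."""
    return np.maximum.accumulate(np.abs(accel))

# Hypothetical excitation: envelope reaching 1.5 g after 15 s
t, a = intensifying_signal(duration=15.0, peak_at_end=1.5)
peak_so_far = running_peak(a)
```

Because `peak_so_far` never decreases, each instant of the analysis can be associated with one excitation level, which is what allows the response to be checked at multiple time steps within a single run.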

#### *2.4. Seismic Design and Performance Criteria*

The general guidelines for seismic analysis, design and safety assessment are documented in bulletins [20] from the Committee on Seismic Aspects of Dam Design of ICOLD, as well as in the specific regulations of the National Commissions on Large Dams of its various member countries. Regarding methods of analysis, it is possible to adopt simplified evaluation procedures based on linear seismic analysis for seismic performance evaluation. However, according to modern performance criteria, structural non-linearities are acceptable to a certain extent under severe earthquake levels. Thus, in current practice, non-linear seismic analysis procedures are considered, enabling researchers to investigate the non-linear structural behavior of dams towards collapse. With respect to the selection of seismic parameters for large dams [20], there are two main levels to be considered for seismic design and safety assessment: the Operating Basis Earthquake (OBE) and the Safety Evaluation Earthquake (SEE).

The OBE is the earthquake that may be expected to occur during the lifetime of the dam, with a minimum return period of 145 years (probability of occurrence of about 50% during a service life of 100 years). The OBE indicates an earthquake level under which significant damage or loss of service must not occur. The seismic performance criteria for the OBE can be verified based on linear-elastic dynamic analyses, by evaluating stresses and deformations, or through rigid body sliding and overturning stability analysis. For example, these types of verifications were conducted for the case of Luzzone Dam, in the scope of an ICOLD International Benchmark Workshop [77,78].
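The relation between the 145-year return period and the roughly 50% probability of occurrence can be checked with a short calculation. The sketch below assumes that earthquake occurrences follow a Poisson process, which is a common modeling assumption and is not stated in the text; the function name is hypothetical.

```python
import math

def exceedance_probability(return_period_years, service_life_years):
    """Probability of at least one exceedance during the service life (Poisson assumption)."""
    return 1.0 - math.exp(-service_life_years / return_period_years)

# OBE: 145-year return period over a 100-year service life
p_obe = exceedance_probability(145.0, 100.0)
print(f"OBE exceedance probability: {p_obe:.3f}")
```

Under this assumption the result is just below 0.5, in line with the "about 50%" figure quoted above.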

The SEE is the governing earthquake event for seismic design and safety assessment of the dam and safety-relevant components. The SEE can be taken as the Maximum Credible Earthquake (MCE), which is expected to induce the largest ground motion at the dam site, considering the seismic history and seismotectonic setup in the dam region, or as the Maximum Design Earthquake (MDE), with a return period of 10,000 years, using suitable ground motion parameters. Under the SEE, structural damage is acceptable to a certain extent; however, the dam must resist without (a) structural integrity being compromised or (b) uncontrolled release of the reservoir, to avoid flooding in the downstream region. The performance criteria assessment under the SEE requires non-linear dynamic analysis methods, in order to investigate the main failure modes. For example, for arch dams, these failure modes include concrete crushing in key areas, under high compressive arch stresses, leading to the loss of bearing capacity in the arch direction, and the local sliding or overturning of blocks at the crest, due to large movements in the upstream direction; required modeling results may include inelastic deformations and joint movements, stresses, and tensile and compressive damage.

#### **3. ETA-Based Method for Seismic Safety Assessment of Arch Dams**

This paper presents a methodology for seismic safety assessment of arch dams based on ETA, using tensile and compressive damage results obtained in non-linear seismic simulations that consider the effects due to joint opening/sliding movements and the concrete non-linear behavior (see Section 4 about the program *DamDySSA*). The proposed approach for evaluating the seismic performance of the dam consists in an intuitive analysis, carried out by controlling the evolution of the damage state of the dam, considering appropriate performance criteria.

The main goal is to determine two endurance limits, one associated with tensile damage (*t*<sup>+</sup>; *a*<sup>*d*+</sup>) and the other with compressive damage (*t*<sup>−</sup>; *a*<sup>*d*−</sup>), which correspond to the duration or the respective acceleration level of an intensifying seismic load that the dam can withstand without presenting unacceptable levels of damage. In practice, the endurance limits are determined for two excitation levels, *a*<sup>*d*+</sup> and *a*<sup>*d*−</sup>, which correspond to the maximum acceleration values of the intensifying seismic action that originate acceptable damage states according to specific criteria defined for tensile and compressive damage, respectively. In this way, it is expected for concrete cracking under tension to become excessive after *a*<sup>*d*+</sup> (repair interventions required), and for compressive damage to increase until concrete crushing occurs in key areas of the dam after *a*<sup>*d*−</sup> (collapse scenario).

The adopted performance criteria are related to the extent of concrete damage on the dam body, with a view towards meeting the requirements defined for large dams under the OBE and the SEE in the current seismic design and safety guidelines [20]. Essentially, in terms of tensile damage, the occurrence of concrete cracking in significant areas of the upstream and/or downstream surfaces of the dam is considered unacceptable, particularly if there is cracking propagation across the thickness of the cantilevers, since this damage state could affect the structural integrity of the dam and require repair interventions; this scenario would not meet the dam safety requirements for the OBE. As for compressive damage, the occurrence of concrete crushing in key areas of the dam, e.g., in the upper blocks of the main cantilevers, is considered unacceptable, namely if there is propagation within these blocks, as this scenario could ultimately result in collapse and hence in an uncontrolled release of water from the reservoir; this would not comply with the dam safety requirements under SEE levels.

Finally, with the proposed approach, the seismic performance of the dam in relation to both earthquake levels is efficiently evaluated based on a single time history analysis, by comparing the acceleration limits for tensile and compressive damage with the peak ground accelerations prescribed for the Operating Basis Earthquake (OBE) and the Safety Evaluation Earthquake (SEE) of the dam site. In summary, the structural safety of the dam under seismic loads is ensured for the OBE, if *a*<sub>OBE</sub> < *a*<sup>*d*+</sup>, and for the SEE, if *a*<sub>SEE</sub> < *a*<sup>*d*−</sup>.
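The comparison described above can be sketched in a few lines. The sketch assumes that the damage state is judged acceptable or not at a set of checkpoints in time and that the excitation level grows linearly with time; the function names, the checkpoint data, and the 0.1 g/s rate are hypothetical values chosen for illustration, not outputs of *DamDySSA*.

```python
def endurance_limit(times, acceptable, accel_per_second):
    """Acceleration level at the last checkpoint before the first unacceptable damage state."""
    last_ok = None
    for t, ok in zip(times, acceptable):
        if not ok:
            break
        last_ok = t
    if last_ok is None:
        raise ValueError("damage state is unacceptable from the first checkpoint")
    return last_ok * accel_per_second

def meets_criterion(a_design, a_limit):
    """Safety is verified when the design acceleration is below the endurance limit."""
    return a_design < a_limit

# Hypothetical checkpoints (s) and damage judgments for tensile damage
times = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
tensile_ok = [True, True, True, True, True, False]

a_limit_tension = endurance_limit(times, tensile_ok, accel_per_second=0.1)  # in g
obe_verified = meets_criterion(a_design=0.1, a_limit=a_limit_tension)
```

The same two functions would be applied to the compressive damage judgments and the SEE acceleration, so both checks come from one intensifying time history.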

The application of this method includes essentially three phases (Figure 3), namely:


**Figure 3.** Schematic representation of the proposed methodology for seismic safety assessment of dams based on ETA.

#### **4. Used Finite Element Program (***DamDySSA***)**

*DamDySSA* is a 3D finite element program for dynamic analysis of concrete dam–reservoir–foundation systems. Developed over several years at LNEC, the latest version of the program includes calculation modules for modal analysis, linear seismic analysis, and non-linear seismic analysis [62]. Figure 4 shows the graphical user interface designed for the program in MATLAB. The coupled finite element formulation and the numerical method for non-linear seismic analysis are summarized in the following sub-sections.

**Figure 4.** *DamDySSA:* finite element program for dynamic analysis of concrete dams. User interface.

#### *4.1. Dynamic Behavior of the Dam–Reservoir–Foundation System: Finite Element Formulation*

The dynamic behavior of the dam–reservoir–foundation system is simulated based on a coupled model in displacements (dam–foundation) and in hydrodynamic pressures (reservoir) [38]. Specific boundary conditions are prescribed at the main interfaces of the solid-fluid system, to consider dam–water dynamic interaction, the propagation of pressure waves in water with radiation at the far end of the reservoir, and the reservoir free surface effect [79]. In *DamDySSA*, a true coupled approach is adopted to solve the dynamic problem without separating the solid and fluid domain equations [62]. Therefore, the finite element equation of the dam–reservoir–foundation system is simply written

$$\mathbf{M}\,\ddot{\mathbf{q}} + \mathbf{C}\,\dot{\mathbf{q}} + \mathbf{K}\,\mathbf{q} = \mathbf{F}, \quad \mathbf{q} = \begin{bmatrix} \mathbf{u} \\ \mathbf{p} \end{bmatrix} \tag{1}$$

where **M**, **C** and **K** are the global mass, damping and stiffness matrices, **F** = **F**(*t*) is the global nodal force vector, and **q** = **q**(*t*) is the coupled unknown vector. These coupled variables are defined as follows

$$\mathbf{M} = \begin{bmatrix} \mathbf{m} & \mathbf{0} \\ \rho_w \mathbf{Q}^T & \mathbf{S} \end{bmatrix}; \quad \mathbf{C} = \begin{bmatrix} \mathbf{c} & \mathbf{0} \\ \mathbf{0} & \mathbf{R} \end{bmatrix}; \quad \mathbf{K} = \begin{bmatrix} \mathbf{k} & -\mathbf{Q} \\ \mathbf{0} & \mathbf{H} \end{bmatrix}; \quad \mathbf{F} = \begin{bmatrix} \mathbf{F}_s \\ \mathbf{F}_w \end{bmatrix} \tag{2}$$

where **u** = **u**(*t*) is the displacement vector for the dam nodes (three degrees of freedom for each node), while **p** = **p**(*t*) is the hydrodynamic pressure vector for the reservoir nodes (each with a single pressure degree of freedom). The mass, damping and stiffness matrices are **m**, **c** and **k**, for the solid domain, and **S**, **R** and **H**, for the fluid domain; the coupling matrix for water pressure–structure motion coupling is **Q**. The nodal force vectors in the solid and fluid domain are **F**<sub>s</sub> = **F**<sub>s</sub>(*t*) and **F**<sub>w</sub> = **F**<sub>w</sub>(*t*), respectively; the forces in the dam may include the dam self-weight, the hydrostatic pressure on the upstream face, and the dynamic loads. Generalized damping is assumed, with natural viscous damping calculated element by element in the solid domain and energy dissipation due to radiation in the reservoir domain. The substructure method [41] is used to model the foundation block as an elastic and massless substructure, considering equivalent stiffness and damping components incorporated in the dam–rock interface; consequently, the seismic input is applied directly at the dam base, assuming uniform ground motion.
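As an illustration of the block structure in Equation (2), the sketch below assembles the global matrices from solid, fluid and coupling sub-matrices with NumPy. It is a toy example and not the *DamDySSA* implementation: the function name, the matrix sizes, the numerical values, and the interpretation of `rho_w` as the water density are assumptions.

```python
import numpy as np

def assemble_coupled(m, c, k, S, R, H, Q, rho_w):
    """Build global M, C, K for the unknown vector q = [u, p] following Equation (2)."""
    n_s, n_f = m.shape[0], S.shape[0]
    zeros_sf = np.zeros((n_s, n_f))  # solid rows, fluid columns
    zeros_fs = np.zeros((n_f, n_s))  # fluid rows, solid columns
    M = np.block([[m, zeros_sf], [rho_w * Q.T, S]])
    C = np.block([[c, zeros_sf], [zeros_fs, R]])
    K = np.block([[k, -Q], [zeros_fs, H]])
    return M, C, K

# Toy system: 3 displacement DOFs and 2 pressure DOFs (hypothetical values)
rng = np.random.default_rng(0)
n_s, n_f = 3, 2
m, c, k = np.eye(n_s), 0.1 * np.eye(n_s), 10.0 * np.eye(n_s)
S, R, H = np.eye(n_f), 0.05 * np.eye(n_f), 2.0 * np.eye(n_f)
Q = rng.random((n_s, n_f))  # coupling matrix: solid rows, fluid columns

M, C, K = assemble_coupled(m, c, k, S, R, H, Q, rho_w=1000.0)
```

Note that the resulting **M** and **K** are not symmetric, which is a direct consequence of keeping displacements and pressures in the same unknown vector.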

In this program, the dam–reservoir-foundation system is discretized using solid hexahedral finite elements with 20 nodes; these are isoparametric elements, and integration is achieved with 2nd degree interpolation functions and 27 Gauss points. The main discontinuities, e.g., dam–foundation interface, joints, or cracks, are discretized using compatible interface elements with 16 nodes and 9 integration Gauss points (Figure 5).

**Figure 5.** Types of finite elements used for discretization of the dam–reservoir-foundation system and of dam discontinuities.

#### *4.2. Non-Linear Time-Stepping Method for Non-Linear Seismic Analysis*

The non-linear seismic response of the dam–reservoir–foundation system is calculated using a non-linear time-stepping method, considering both the structural effects due to the joint movements and the non-linear behavior of concrete with softening under tension and compression [62]. The implemented method combines a time-stepping formulation, based on the application of the principles of the Newmark method [80] to the coupled dynamic problem, and a stress-transfer iterative method [81] to simulate non-linear dam behavior within each time step *t*+∆*t*. The goal is to solve the coupled dynamic equation in the time domain

$$\mathbf{M}\,\ddot{\mathbf{q}}_{t+\Delta t} + \mathbf{C}\,\dot{\mathbf{q}}_{t+\Delta t} + \mathbf{K}\,\mathbf{q}_{t+\Delta t} = \mathbf{F}_{t+\Delta t} + \boldsymbol{\Psi}_{t+\Delta t} \tag{3}$$

where **Ψ** is the vector of unbalanced forces, which is introduced in the problem to reproduce the stress redistribution process that takes place as structural non-linear behavior progresses.
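For reference, one step of the Newmark method for a linear system can be sketched as below, using the constant average acceleration variant (β = 1/4, γ = 1/2). This is a generic textbook form and not the *DamDySSA* code: the function name and the single-degree-of-freedom test case are assumptions, and in a stress-transfer scheme the argument `f_next` would hold the applied forces plus the current estimate of the unbalanced forces, updated iteratively within the step.

```python
import numpy as np

def newmark_step(M, C, K, q, v, a, f_next, dt, beta=0.25, gamma=0.5):
    """Advance displacement q, velocity v and acceleration a from t to t + dt."""
    K_eff = K + gamma / (beta * dt) * C + M / (beta * dt**2)
    rhs = (f_next
           + M @ (q / (beta * dt**2) + v / (beta * dt) + (0.5 / beta - 1.0) * a)
           + C @ (gamma / (beta * dt) * q + (gamma / beta - 1.0) * v
                  + dt * (0.5 * gamma / beta - 1.0) * a))
    q_next = np.linalg.solve(K_eff, rhs)
    a_next = (q_next - q) / (beta * dt**2) - v / (beta * dt) - (0.5 / beta - 1.0) * a
    v_next = v + dt * ((1.0 - gamma) * a + gamma * a_next)
    return q_next, v_next, a_next

# Hypothetical check: undamped oscillator with a 1 s period, released from q = 1
m = np.array([[1.0]])
c = np.zeros((1, 1))
k = np.array([[(2.0 * np.pi) ** 2]])
q, v = np.array([1.0]), np.array([0.0])
a = np.linalg.solve(m, -k @ q)  # initial acceleration from equilibrium
dt = 0.001
for _ in range(1000):  # integrate over one full period
    q, v, a = newmark_step(m, c, k, q, v, a, np.zeros(1), dt)
```

After one full period the oscillator should return close to its initial displacement, which gives a quick sanity check of the time-stepping formulas before any non-linear iteration is added.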

In *DamDySSA*, the stress-transfer iterative process is conducted in each time step *t*+∆*t*, being divided into two iterative sub-processes: the first, to simulate the effects due to joint movements, and the second, to model the concrete behavior up to failure under tension and compression [62]. Therefore, the unbalanced forces that arise in the iterative process are associated with the unbalanced stresses due to joint and concrete non-linear behavior. The unbalanced stresses are computed as the difference between the installed stresses and the material strength, considering the following constitutive models (Figure 6): (i) the non-linear joint behavior is simulated using a constitutive model based on the Mohr–Coulomb failure criterion (Figure 6a), with or without cohesion, and considering appropriate normal/shear stress-displacement laws to account for opening/closing and sliding movements [55,56], and (ii) the concrete behavior up to failure is reproduced using a 3D isotropic damage model with strain-softening and two independent scalar damage variables, *d*<sup>+</sup> for damage under tension and *d*<sup>−</sup> for damage under compression [82,83] (Figure 6b).
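The notion of an unbalanced stress for a joint can be illustrated with a minimal Mohr–Coulomb check. The sketch below is a simplification made for illustration only: it assumes compression is negative, ignores any tension cut-off and the stress-displacement laws, and uses hypothetical names and stress values; it does not reproduce the joint model of *DamDySSA*.

```python
import math

def joint_unbalanced_shear(sigma_n, tau, cohesion=0.0, friction_angle_deg=30.0):
    """Shear stress in excess of the Mohr-Coulomb strength (same sign as tau).

    sigma_n: normal stress on the joint, compression negative (assumed convention).
    tau: installed shear stress. All stresses in the same units.
    """
    tan_phi = math.tan(math.radians(friction_angle_deg))
    strength = max(0.0, cohesion - sigma_n * tan_phi)
    excess = max(0.0, abs(tau) - strength)
    return math.copysign(excess, tau)

# Hypothetical stress state: 1 MPa compression, null cohesion, 30 degree friction angle
sliding = joint_unbalanced_shear(sigma_n=-1.0, tau=0.8)  # exceeds strength
sticking = joint_unbalanced_shear(sigma_n=-1.0, tau=0.3)  # below strength
```

A non-zero result is the part of the installed shear stress that the joint cannot carry and that would be redistributed through the unbalanced force vector.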

**Figure 6.** (**a**) Non-linear joint model with cohesion and (**b**) Concrete constitutive model with two independent damage variables.

#### **5. Results: Seismic Safety Assessment of Arch Dams**

The ETA-based methodology presented in Section 3 is applied to evaluate the seismic performance of two large arch dams, namely the 132 m-high Cabril Dam (Portugal) and the 170 m-high Cahora Bassa Dam (Mozambique), and the non-linear seismic simulations are conducted using the program *DamDySSA*. Performance endurance limits associated with the evolution of tensile damage and compressive damage are determined and compared with peak ground accelerations for the OBE and the SEE. This section presents the case studies and the main results of this work.

#### *5.1. Case Study I: Cabril Dam (132 m-High)*

#### 5.1.1. Dam Description and Finite Element Mesh

The first case study is the iconic Cabril Dam (Figure 7), the highest dam in Portugal and an essential part of the country's infrastructure, in operation since 1954. Cabril Dam is a 132 m-high double curvature arch dam, with a 290 m-long crest; this dam was designed with a unique geometry, as seen in the central cantilever, where the cross-section thickness varies between a maximum of 20 m, near the dam base, and a minimum of 4.5 m, about 7 m below the crest, increasing again to 7 m at the crest level. The dam was constructed on a good quality granite rock mass, and it impounds a reservoir with an area of around 20 million m<sup>2</sup> and an effective storage of about 615 million m<sup>3</sup> . The reservoir level usually ranges from a minimum at el. 265 m to the normal level (NWL) at el. 295 m.

**Figure 7.** Case study I: Cabril Dam. Location and seismic hazard zones. Upstream, cross-section and plan views. Variation of the reservoir level over time.

Located in the center of Portugal, Cabril Dam lies in a national region of high seismic risk, close to some active intraplate faults. A seismic risk study has not yet been conducted at the dam site, so there are no reference peak ground acceleration values for the OBE and the SEE. Nevertheless, a seismic hazard analysis was conducted for a site not very far from Cabril Dam, to the north-northwest, for which peak ground accelerations of 0.06 g (OBE) and 0.14 g (MDE) were prescribed [75]. Since Cabril Dam is in an area of higher seismic risk, similar or higher earthquake ground motion may be expected. Thus, peak ground accelerations of 0.1 g (OBE) and 0.2 g (MDE) have been assumed as reference for the seismic behavior studies to be conducted for Cabril Dam [62], in order to meet the recommendations of the Portuguese Standards for dam design [84].

Figure 8 shows the latest finite element mesh of the Cabril dam–reservoir–foundation system (with three elements in thickness in the dam body) and the main material properties. The dam concrete and foundation rock are assumed as isotropic materials, with Young's modulus *E* = 25 GPa and Poisson's ratio *ν* = 0.2, while the water in the reservoir is considered a compressible fluid with a pressure wave propagation velocity *c*<sub>w</sub> = 1440 m/s. These material properties have been validated based on experimental results from vibration monitoring data under ambient/operational conditions and during seismic events [15–17]. For non-linear seismic analysis, all vertical contraction joints were incorporated into the dam mesh, using appropriate normal and shear stiffness values, null cohesion, and a 30° friction angle. The concrete constitutive damage law was adopted for all dam elements, with tensile strength *f*<sub>t</sub> = 3 MPa and compressive strength *f*<sub>c</sub> = −30 MPa.

**Figure 8.** Finite element mesh and material properties used for dynamic analysis of Cabril Dam.

5.1.2. Non-Linear Seismic Analysis and Seismic Safety Assessment

For seismic safety assessment, the non-linear seismic behavior of Cabril Dam is simulated considering a dynamic load combination that includes the dam self-weight (SW), the hydrostatic pressure for full reservoir (HP132), and a seismic load (SEISMICL) consisting of an acceleration time history designed for ETA, with accelerations increasing to about 1.5 g in 15 s (Figure 9). The non-linear response results show that, when the seismic forces push the dam upstream, significant upstream displacements occur along the upper blocks, together with relative movements between the surfaces of the cantilevers. Furthermore, the opening of the vertical contraction joints causes a release of the arch tensions at the top of the dam, which prevents the occurrence of tensile damage. However, the subsequent stress redistribution process leads to an increase in vertical stresses along the height of the cantilevers, with vertical tensions that end up surpassing the concrete strength, causing concrete damage.

**Figure 9.** Non-linear seismic analysis of Cabril Dam. Deformed shape and principal stresses for t = 5.4 s, radial displacement envelope at the central section (until t = 5.4 s), and displacement time history at the crest central point.

The evolution of the tensile and compressive damages obtained in the non-linear simulations under intensifying seismic accelerations is presented in Figure 10, aiming to make an overall assessment of the seismic performance of Cabril Dam and estimate the endurance limits according to the established performance criteria (recall Section 3).

Until t = 5 s, there is a gradual progression of tensile damage, and tensile failure ultimately occurs in several blocks along the upper part of the downstream face of the dam, and near the dam base, on the upstream side. Nevertheless, concrete cracking is mostly superficial and so this tensile damage state is considered acceptable. After that, between t = 5 s (a ≈ 0.5 g) and t = 6 s (a ≈ 0.6 g), there is an important increase in tensile damage at both upstream and downstream faces, with concrete cracking covering a significant part of the upper half of the dam, and already propagating from upstream to downstream in several blocks. This scenario could affect the structural integrity of the dam and thus require the interruption of normal operating conditions for repairs, hence failing to meet the performance criterion defined in the proposed method for the OBE excitation level. Accordingly, the endurance limit for tensile damage (t = 5 s) corresponds to an acceleration of 0.5 g, which is five times the value assumed as the OBE peak ground acceleration (0.1 g) at the dam site.

**Figure 10.** Seismic safety assessment of Cabril Dam: evolution of tensile and compressive damage for increasing acceleration levels.

As for the compressive damage evolution, it is worth emphasizing that compressive damage began to arise only after t = 11 s (a ≈ 1.1 g), while the first occurrence of concrete compressive failure is reported at the top of the central cantilever, only after the dam was subjected to peak ground accelerations of about 1.3 g. Since the adopted performance criterion for compressive damage is based on the non-occurrence of concrete crushing with propagation across the blocks in key areas of the dam, which could induce collapse and uncontrolled release of water from the reservoir, the compressive endurance limit is at least 1.3 g, 6.5 times more than the peak ground acceleration considered for the MDE (0.2 g) for Cabril Dam, thus demonstrating its impressive resistant capacity.

#### *5.2. Case Study II: Cahora Bassa Dam (170 m-High)*

#### 5.2.1. Dam Description and Finite Element Mesh

The second case study is the Cahora Bassa Dam (Figure 11), located on the Zambezi River, near Songo, in western Mozambique. Built from 1969 to 1974, it is one of the highest dams on the African continent. Cahora Bassa is a thin 170 m-high double curvature arch dam, with a 303 m-long arch at the crest, which presents a unique half-hollow shape. The thickness of the central cantilever ranges from a maximum of 23 m at the dam base, to about 4 m at the crest level. In addition, the dam has one control surface spillway and eight half-height spillways. The dam was constructed on a gneissic granite rock mass of very good quality, and it impounds Lake Cahora Bassa, which is 270 km long and 30 km wide at its widest point. The reservoir level does not usually present significant changes, varying between about el. 326 m and el. 320 m.

Cahora Bassa Dam is located in an earthquake hazard area, not far from the East African Rift system, an active continental rift that extends from the Red Sea to the Indian Ocean, across Mozambique, and is responsible for most earthquake events in Eastern Africa. According to a study on the seismic behavior of the dam [76], for seismic safety assessment the recommendations of the Portuguese Standards for dam design [84] can be followed. As such, the OBE and the MDE excitation levels must be considered for the evaluation of both regular and failure scenarios; the peak ground acceleration values to be used are those determined in the seismic hazard evaluation conducted for the Cahora Bassa Dam area [76], of 0.076 g (OBE) and 0.102 g (MDE).

**Figure 11.** Case study II: Cahora Bassa Dam. Location and seismic hazard zones. Upstream, cross-section and plan views. Variation of the reservoir level over time.

The most recent finite element mesh of the Cahora Bassa dam–reservoir–foundation system is presented in Figure 12. The dam mesh, with three elements in thickness, replicates the real dam geometry quite well; still, in this version, the half-hollow crest shape and the geometry of the spillways are represented in a simplified way. The dam concrete and the foundation rock are considered to be isotropic materials, using Young's modulus *E* = 40 GPa and Poisson's ratio *ν* = 0.2, while the reservoir water is simulated as a compressible fluid with a pressure wave propagation velocity *c*<sub>w</sub> = 1500 m/s. All material properties have been calibrated using dynamic experimental data, including modal parameters estimated from measured vibrations and seismic response results [15–17]. In order to simulate the non-linear structural response, vertical contraction joints were introduced in the dam body, considering calibrated stiffness values, null cohesion, and a 30° friction angle; the non-linear behavior of concrete is reproduced using the constitutive damage law with tensile strength *f*<sub>t</sub> = 3 MPa and compressive strength *f*<sub>c</sub> = −30 MPa for all dam elements.

**Figure 12.** Finite element mesh and material properties used for dynamic analysis of Cahora Bassa Dam.

5.2.2. Non-Linear Seismic Analysis and Seismic Safety Assessment

In order to conduct the seismic safety evaluation of Cahora Bassa Dam, the non-linear seismic response (Figure 13) is computed for a load combination including the dam self-weight (SW), the hydrostatic pressure for full reservoir (HP170), and a seismic load (SEISMICL) represented by the acceleration time history designed for ETA. The non-linear response results show that, when the larger dynamic motions are in the upstream direction, the lateral cantilevers move upstream while the central cantilevers move in the opposite direction. In terms of structural effects due to the joint movements, similarly to the previous case study, the opening of the vertical contraction joints causes a reduction of the arch effect, and thus the arch tensions along the top of the dam are released, hence avoiding tensile damage at the upper blocks. However, the subsequent stress redistribution leads to an increase in vertical stresses along the height of the dam cantilevers, and vertical tensions become greater than the concrete tensile strength.

**Figure 13.** Non-linear seismic analysis of Cahora Bassa Dam. Deformed shape and principal stresses for t = 3.2 s, radial displacement envelope for the lateral cantilever (until t = 3.2 s), and displacement time history at the crest of the lateral cantilever.

Lastly, the seismic safety assessment of Cahora Bassa Dam and the determination of performance endurance limits under the intensifying seismic excitation is conducted based on the evolution of the tensile and compressive damage (Figure 14), taking into account the established performance criteria (see Section 3).

The tensile damage evolution up to t = 5 s (a ≈ 0.5 g) shows that superficial concrete failure occurs near the upstream base and along the upper half of the downstream face of the dam. Since there is no sign of concrete cracking propagating across the thickness of the cantilevers, the resulting tensile damage state may be considered acceptable. However, beyond that point there is a considerable propagation of concrete cracking over the upstream and downstream faces, and there are several blocks on most cantilevers where tensile failure has propagated through the entire section. Naturally, this scenario should be considered unacceptable, given the adopted criterion for the OBE. As such, the endurance limit related to tensile damage corresponds to an acceleration value of about 0.5 g, about 6.5 times higher than the OBE peak ground acceleration (0.076 g) prescribed for Cahora Bassa Dam.

In what concerns the compressive damage evolution, until t = 6 s, there are no signs of damage due to compressions. After that point, however, compressive damage starts to increase progressively, specifically at the top of the lateral cantilevers, on the upstream face, and below the surface spillway, on the downstream face, until t = 9 s, when concrete crushing due to compressive failure has occurred from upstream to downstream in the blocks under the surface spillway. Bearing in mind the adopted performance criteria to meet the requirements under the SEE level, this scenario would not be acceptable as it could lead to local collapse and hence to the uncontrolled release of the reservoir. Thus, the endurance limit associated with compressive failure is set to around 0.8 g, about eight times the peak ground acceleration value of the MDE (0.102 g) for Cahora Bassa Dam.

**Figure 14.** Seismic safety assessment of Cahora Bassa Dam: evolution of tensile and compressive damage for increasing acceleration levels.

#### *5.3. Discussion*

Section 5 provided the main results of the studies conducted for the seismic safety assessment of two large arch dams, namely Cabril Dam and Cahora Bassa Dam, using the proposed ETA-based method (Section 3) and the finite element program *DamDySSA* (Section 4). Considering the adopted performance criteria, established to meet the requirements for the OBE and SEE, endurance limits were determined for both dams based on the evolution of tensile and compressive damage under intensifying seismic excitation.

For Cabril Dam, the endurance limits were 0.5 g for tensile damage and at least 1.3 g for compressive damage: these values are, respectively, 5 and 6.5 times greater than the peak ground accelerations assumed for the OBE (0.1 g) and for the MDE (0.2 g). Therefore, the conducted seismic performance evaluation demonstrated the impressive resistance of Cabril Dam under seismic loads. Moreover, the achieved results suggest that the seismic safety of the dam might be clearly verified in future studies if similar or even higher values than those assumed here are defined for the OBE and the MDE in seismic hazard studies carried out at the dam site.

As for Cahora Bassa Dam, the endurance limits were about 0.5 g for tensile damage and around 0.8 g for compressive damage. These acceleration values are about 6.5 and 8 times greater, respectively, than the OBE (0.076 g) and MDE (0.102 g) peak ground accelerations prescribed for the Cahora Bassa Dam site. This seismic safety study showed that, despite being a thin 170 m-high arch dam, Cahora Bassa Dam performs very well under strong seismic loads, as both endurance limits were considerably greater than the values of the OBE and MDE peak ground accelerations.
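The ratios quoted in this discussion follow from dividing each endurance limit by the corresponding reference peak ground acceleration (tensile limit against the OBE, compressive limit against the MDE). A minimal Python sketch of that arithmetic, using only the values reported in this section, is given below; the dictionary layout and the function name `safety_factor` are illustrative choices, not part of *DamDySSA*.

```python
# Endurance limits and reference PGAs (in g) as reported in this section.
# Each tuple is (endurance limit, prescribed peak ground acceleration).
cases = {
    "Cabril": {"tensile": (0.5, 0.1), "compressive": (1.3, 0.2)},
    "Cahora Bassa": {"tensile": (0.5, 0.076), "compressive": (0.8, 0.102)},
}


def safety_factor(endurance_limit_g, reference_pga_g):
    """Ratio between the endurance limit and the prescribed PGA."""
    return endurance_limit_g / reference_pga_g


factors = {
    dam: {kind: safety_factor(*values) for kind, values in limits.items()}
    for dam, limits in cases.items()
}

for dam, result in factors.items():
    print(dam, {kind: round(value, 2) for kind, value in result.items()})
```

The ratios come out at 5 and 6.5 for Cabril Dam and at roughly 6.6 and 7.8 for Cahora Bassa Dam, consistent with the rounded figures quoted above. Since the compressive limit for Cabril Dam is reported as "at least 1.3 g", its ratio should be read as a lower bound.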

Finally, the same tensile and compressive concrete strength values were considered for Cabril Dam and Cahora Bassa Dam, so it is worth comparing their seismic performance. On the one hand, Cahora Bassa Dam is taller and thinner than Cabril Dam, hence higher stresses tend to develop. Thus, since compressions are considerably higher in Cahora Bassa Dam [55], the endurance limit associated with compressive damage is lower than for Cabril Dam. On the other hand, both arch dams presented similar tensile damage evolutions until t = 5 s, resulting in equal endurance limits; after that, however, the tensile damage progression was more severe for Cahora Bassa Dam. Overall, comparing the endurance limits with the prescribed peak ground accelerations for the OBE and the SEE shows that both dams have considerable safety factors. In the case of Cahora Bassa Dam, located in an area of lower seismicity, the safety factors are slightly higher than for Cabril Dam.

#### **6. Conclusions and Future Research**

Seismic safety assessment is a fundamental issue for dam safety control, and it is therefore essential to develop suitable methods of analysis and robust models to analyze the seismic behavior of dams under strong earthquakes.

To contribute to this field, an ETA-based method for seismic safety assessment of arch dams, using tensile and compressive damage results from advanced non-linear seismic analyses considering joints and concrete non-linear behavior, was presented. In the ETA-based method, a new and intuitive approach was proposed for seismic performance evaluation, namely by controlling the evolution of the damage state of the dam. Considering suitable performance criteria, acceleration endurance limits are estimated based on the evolution of tensile and compressive damage, and then compared, respectively, with reference peak ground accelerations prescribed for the OBE and the SEE of the dam site. With this approach, it is possible to conduct the seismic safety assessment in relation to both earthquake levels (OBE and MDE) in an efficient manner, by conducting a single intensifying acceleration time history analysis.

An important innovation was also achieved in the finite element program *DamDySSA*, used to carry out the seismic analyses for this work. Under development at LNEC for several years for the dynamic analysis of dam–reservoir–foundation systems, the program is based on a coupled finite element formulation in displacements and pressures, and a robust formulation was recently implemented for non-linear seismic analysis, considering (a) nonlinear joint behavior, using a constitutive model based on the Mohr–Coulomb failure criterion and using normal/shear stress-displacement laws for opening/closing and sliding movements, and (b) the concrete behavior up to failure, using an isotropic damage model with strain-softening and independent tensile and compressive scalar damage variables. A considerable investment was also made in programming the graphical outputs, to achieve realistic 3D representations of the damage in the dam body and hence facilitate the analysis and interpretation of the results.

The ETA-based method was then used to evaluate the seismic performance of the 132 m-high Cabril Dam (Portugal) and the 170 m-high Cahora Bassa Dam (Mozambique). The achieved results confirmed the potential of the proposed methodology for seismic safety assessment of concrete dams and showed that it enables a simple and intuitive analysis, which can be very beneficial for dam safety officers and dam owners. Therefore, the method can be used to support the seismic design and safety assessment of new dams or to conduct analyses for seismic safety reassessment of older dams that have been designed based on outdated methods and regulations; for example, a program for seismic safety reassessment of older large concrete dams of high potential risk could be carried out in Portugal and Mozambique, and the tools presented in this paper could prove useful. The proposed method can also be used to conduct seismic safety evaluations of dams for scenarios considering different reservoir water levels, distinct concrete strength properties, and various intensifying seismic accelerograms with different frequency content.

In terms of future research, the proposed ETA-based methodology may be improved by including specific parameters that allow for a more objective quantification of the global damage state of the dam. As for the program *DamDySSA*, it could be enhanced by implementing parallelization techniques, using GPU computing, to increase the computational efficiency of the non-linear analyses.

**Author Contributions:** Conceptualization, A.A. and S.O.; data curation: A.A., S.O. and P.M.; formal analysis: A.A. and S.O.; investigation: A.A., S.O., P.M., J.P. and R.R.; methodology: A.A. and S.O.; software: A.A. and S.O.; validation: A.A., S.O., P.M. and R.R.; visualization: A.A. and S.O.; writing original draft: A.A. and S.O.; writing—review and editing: A.A., S.O., P.M., J.P., R.R. and E.C.; supervision: A.A. and S.O.; funding acquisition: S.O., P.M. and J.P.; project administration: A.A., S.O., P.M., J.P. and E.C.; resources: A.A., S.O., P.M., J.P. and E.C. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Foundation for Science and Technology (FCT) in the framework of the project PTDC/ECI-EGC/5332/2020 - Seismic Monitoring and Structural Health of Large Concrete Dams (SSHM4Dams), involving LNEC, ISEL-IPL and IST-ID, and also within the scope of the project UIDB/04625/2020, which is under development at the Center for Research and Innovation in Civil Engineering for Sustainability (CERIS).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study may be available on request from the corresponding author.

**Acknowledgments:** The authors thank Eletricidade de Portugal (EDP) and Hidroeléctrica de Cahora Bassa (HCB) for allowing the use of Cabril Dam and Cahora Bassa Dam, respectively, as case studies.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **An Automated Machine Learning Engine with Inverse Analysis for Seismic Design of Dams**

**Mohammad Amin Hariri-Ardebili <sup>1,2,</sup>\* and Farhad Pourkamali-Anaraki <sup>3</sup>**


**Abstract:** This paper proposes a systematic approach for the seismic design of 2D concrete dams. As opposed to the traditional design method, which does not optimize the dam cross-section, the proposed design engine offers the optimal one based on the predefined constraints. A large database of about 24,000 simulations is generated based on transient simulation of the dam-foundation-water system. The database includes over 150 various dam shapes, water levels, and material properties, as well as 160 different ground motion records. Automated machine learning (AutoML) is used to generate a surrogate model of dam response as a function of thirty variables. The accuracy of single- and multi-output surrogate models is compared, and the efficiency of the design engine for various settings is discussed. Next, a simple yet robust inverse analysis method is coupled with a multi-output surrogate model to design a hypothetical dam in the United States. Given the seismic hazard scenario, the geological survey data, and the concrete mix, the dam shape is estimated and compared to direct finite element simulation. The results show promising accuracy from the AutoML regression. Furthermore, the design shape from the inverse analysis is in good agreement with the design objectives and also the finite element simulations.

**Keywords:** design variable; finite element; feasibility design; surrogate; gravity dams; AutoML

#### **1. Introduction**

Seismic design and analysis of concrete dams have always been challenging tasks because multiple factors are involved in performance evaluation [1]. They include, but are not limited to, the semi-unbounded size of the reservoir and foundation rock domains, fluid-structure interaction, wave absorption at the reservoir boundary, water compressibility, foundation rock-structure interaction, spatial variations in ground motion at the dam-foundation interface, and the nonlinear damage mechanism of dam concrete. A detailed review of the dynamic analysis of concrete dams can be found in [2]. However, the seismic safety of existing dams is different from the seismic design of new ones. Whereas linear elastic analyses are warranted for design, nonlinear ones must be performed when the complete structural response is desired, the failure load is to be determined as accurately as possible, or the "true" factor of safety must be found [3].

Structural design is founded on verification of the safety inequality: "Demand ≤ Capacity". This inequality can be interpreted with different engineering response quantities, which leads to various seismic design philosophies. In a broad classification, the seismic design is force-based, displacement-based, or energy-based. If both sides of the safety inequality are written in terms of forces or moments, the design is force-based, and if displacement or deformation (e.g., deflection, curvature, strain, and rotation) is used, the design is displacement-based. Finally, if energy terms are compared, the design is energy-based [4].

**Citation:** Hariri-Ardebili, M.A.; Pourkamali-Anaraki, F. An Automated Machine Learning Engine with Inverse Analysis for Seismic Design of Dams. *Water* **2022**, *14*, 3898. https://doi.org/10.3390/w14233898

Academic Editor: Paolo Mignosa

Received: 20 October 2022 Accepted: 27 November 2022 Published: 30 November 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In force-based design, the lateral load-resisting system is designed for an equivalent static force. However, in the displacement-based method, multiple (drift-based) performance levels are checked to ensure the displacement does not exceed the threshold values. In force-based design, the structural system is designed for a single seismic hazard level, i.e., the design basis earthquake (DBE), in which the structure should satisfy the life safety performance objective. However, the satisfaction of one performance level does not guarantee the satisfaction of other (i.e., higher) performance levels. In contrast, the displacement-based design operates on multiple performance levels to satisfy all of them. It is noteworthy that an earthquake originally imposes energy (and not force) on a structure. Such energy (through the foundation) produces displacement relative to the ground. The forces are a byproduct of this relative displacement and not the other way around. Therefore, a displacement-based method is more direct and more detailed.

Concrete gravity dams have traditionally been designed by an extended version of the force-based procedure [5,6]. As opposed to framed structures, which include only one equivalent lateral force, there are three static lateral forces in dams: (1) the forces associated with the weight of the dam, obtained as the product of a seismic coefficient ($\alpha \in [0.05, 0.10]$) and the weight of the portion of the dam being considered, (2) the forces associated with reservoir hydrodynamic pressure, obtained as the product of $\alpha$ and a pressure coefficient, $\alpha_p$, and (3) the hydrostatic pressure [7]. This method has several limitations: the dynamic characteristics of the coupled system, as well as the time- and frequency-domain characteristics of the ground motion records, are not considered. The traditional design method also has very conservative criteria: the compressive stress should be limited to 1/4 of the compressive strength, $f'_c$, and tension is usually not permitted, or the allowable tensile stress is very small. In addition, the static sliding and overturning criteria have little meaning in the context of traditional seismic dam design, as the oscillatory responses are ignored.
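To make the traditional checks concrete, the following Python sketch evaluates the pseudo-static inertia force (seismic coefficient times weight) and the one-quarter compressive stress limit described above. All numerical values and function names are hypothetical and chosen only for illustration; the hydrodynamic and hydrostatic terms are omitted because their exact expressions are not given here.

```python
def inertia_force(weight_kn, alpha):
    """Pseudo-static horizontal inertia force: seismic coefficient times weight."""
    return alpha * weight_kn


def allowable_compression(fc_mpa):
    """Traditional limit: one quarter of the concrete compressive strength."""
    return fc_mpa / 4.0


# Hypothetical inputs (not taken from the paper).
weight_kn = 120_000.0  # weight of the dam portion being considered
fc_mpa = 24.0          # concrete compressive strength
sigma_c_mpa = 4.5      # computed maximum compressive stress
sigma_t_mpa = 0.0      # computed maximum tensile stress

force_kn = inertia_force(weight_kn, alpha=0.10)
compression_ok = sigma_c_mpa <= allowable_compression(fc_mpa)
tension_ok = sigma_t_mpa <= 0.0  # strictest form: no tension permitted

print(force_kn, allowable_compression(fc_mpa), compression_ok, tension_ok)
```

Note that nothing in these checks depends on the ground motion record or on the dynamic properties of the dam, which is precisely the limitation pointed out in the paragraph above.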

The authors are unaware of any previous attempt at the displacement-based design of concrete dams. The only accessible document is the research by Andonov et al. [8], which proposed the use of a displacement-based method for linear and nonlinear seismic assessment of existing dams. Beyond the previous classification of seismic design methods, other extensions have been proposed, such as performance-based seismic design [9], reliability-based seismic design [10], risk-based seismic design [11], and, more recently, resilience-based seismic design [12]. Despite the availability of such advanced seismic design frameworks, they are rarely used in concrete dam design, probably due to the complexity of the numerical model. For example, Ferguson et al. [13] showed the application of the risk-informed design framework for roller-compacted concrete dams under extreme seismic events.

On the other hand, the shape optimization of the existing dam layout has been discussed widely for both gravity and arch dams. The pioneering work belongs to Ramakrishnan and Francavilla [14], who used the penalty function to optimize the shape of gravity dams. Others adopted simple or advanced optimization algorithms for dam shape optimization [15–17]. Some others combined the optimization algorithms with machine learning to accelerate the process [18,19]. A risk-based framework for shape optimization of arch dams was introduced by Talatahari et al. [20], in which expected costs of failure are incorporated in the analysis. More recently, a surrogate-assisted shape optimization framework for dams was introduced by Fengjie and Lahmer [21], which incorporates various uncertainty sources. This method has been extended by Abdollahi et al. [22] for multiple seismic performance levels by eliminating the design dependency on a particular ground motion record. Nearly all these techniques require advanced knowledge of optimization techniques and/or machine learning modeling. Moreover, a large number of finite element simulations is required to satisfy the objective functions. Therefore, they are not popular among practitioners. In addition, a series of particular cost functions is required during the shape optimization, such as the volume of concrete, construction quality and complexity, location of the dam, and availability of materials, which makes the generalization of results nearly impossible.

With multiple limitations in the traditional seismic design method, and also the complexity of the advanced shape optimization techniques, there is a need for a simple yet accurate seismic design framework. This paper proposes a finite element-based design engine to assist practitioners in feasibility-level (i.e., initial) layout development for 2D concrete gravity dams under seismic events. The engine includes a large inventory of gravity dam shapes with different material properties for concrete and foundation rock. This inventory has been subjected to different water levels and many ground motion records. The current database covers about 24,000 unique combinations of dam shapes, material properties, water levels, and earthquake loading. Further, a low-code automated machine learning (AutoML) tool is used to develop a high-fidelity surrogate model that connects all the design variables to response quantities (e.g., displacements and stresses). Such an AutoML surrogate model has never been trained for structural systems (more specifically dams) with both epistemic (i.e., modeling and material) and aleatory (i.e., loading) variability [23,24]. Therefore, the first contribution of this paper is to explore the accuracy of such a surrogate model under the different assumptions that a prepackaged AutoML tool offers to analysts. Next, a surrogate model-based inverse analysis is introduced for initial parameter estimation during dam design. Using the engine generated in this paper, and having some information about the seismic event, the engineer will be able to estimate the dam shape for different levels of response quantity in a second. This engine provides the best initial guess (shape and material of the dam) based on the constraints that are introduced by the analyst.
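The inverse step can be pictured with a deliberately simple Python sketch: given any forward surrogate that maps a design variable to a response, scan a set of candidate designs and keep the one whose predicted response is closest to the target. The function `toy_surrogate`, its formula, and the candidate grid are hypothetical stand-ins with no physical meaning, and this brute-force search is not necessarily the procedure adopted later in the paper.

```python
def toy_surrogate(base_width_m, height_m=100.0):
    """Hypothetical stand-in for a trained surrogate (no physical meaning):
    the predicted response decreases as the base width grows."""
    return 1000.0 * height_m / base_width_m ** 2


def invert(surrogate, target, candidates):
    """Return the candidate whose predicted response is closest to the target."""
    return min(candidates, key=lambda x: abs(surrogate(x) - target))


# Candidate base widths from 50 m to 150 m in 5 m steps (illustrative grid).
candidates = [50.0 + 5.0 * i for i in range(21)]

best_width = invert(toy_surrogate, target=10.0, candidates=candidates)
print(best_width, toy_surrogate(best_width))
```

Because each surrogate evaluation is cheap compared with a transient finite element run, even an exhaustive scan of this kind completes almost instantly, which is what makes a surrogate-based inverse analysis attractive at the feasibility stage.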

The paper's structure is as follows: Section 2 provides a quick review of AutoML and its differences from classical machine learning approaches. Furthermore, a high-level review is provided of the application of machine learning in the seismic analysis of dams. Section 3 provides the underpinning theories about the design variables used in this paper to develop the surrogate models. The data structure is discussed in Section 4, including a brief explanation of the software used for finite element simulations, and also some generic responses from the database. Section 5 dives into the AutoML application and the anatomy and performance of the developed surrogate models, while Section 6 explores the design engine and inverse problems, as well as some practical examples. Finally, a summary of the research is provided in Section 7.

#### **2. Automated Machine Learning (AutoML)**

#### *2.1. ML-Based Response Evaluation of Dams*

Most of the current applications of machine learning in dam engineering are focused on structural health monitoring which mainly compiles the measured data during the lifetime of the dam and predicts the response trend [25,26]. This is not the focus of our paper. In this paper, (automated) machine learning is used to post-process the results of finite element simulations. Studies in this field are limited and there is no comprehensive research on comparing different techniques.

Chen et al. [27] evaluated the probability of sliding in a dam using an improved response surface method. Karimi et al. [28] proposed a neural network procedure for system identification of gravity dams coupled with a hybrid finite element-boundary element analysis to estimate the dynamic characteristics of an empty dam. Gaspar et al. [29] conducted a global sensitivity analysis of the thermo-chemo-mechanical coupled model of RCC's physical properties. Gu et al. [30] proposed a chaos genetic optimization algorithm to invert the initial zoning deformation modulus and determine the inversion objective function using the measured displacement and finite element method. Su et al. [31] proposed the application of least squares support vector machine and conditional back analysis for optimal selection of dam parameters. Hariri-Ardebili and Pourkamali-Anaraki [32,33] showed the application of several machine learning techniques in the multi-hazard analysis of gravity dams. Both the simplified and nonlinear damage analyses were performed including the seismic, hydrologic, and aging hazard sources. Seismic reliability and sensitivity of concrete dams were investigated with polynomial chaos expansion (PCE) [34] and adaptive Kriging methods [35].

Segura et al. [36] developed a series of seismic fragility curves for concrete dams using various machine learning methods. Macedo et al. [37] developed a series of new models for estimating seismically-induced slope displacements based on various machine learning techniques. Zhou et al. [38] coupled the support vector machine with a plastic failure model for fragility analysis of concrete-faced rockfill dams. Cheng et al. [39] proposed two back-analysis frameworks based on multivariate machine learning models to determine the dynamic properties of the material in concrete dams. Salazar and Hariri-Ardebili [40] combined the random forest method with stochastic finite element procedure to evaluate the impact of concrete heterogeneity in dams. A PCE and Random Forest-based model is also used for sensitivity analysis of heterogeneous arch dams [41]. Segura et al. [42] developed a dual-layer meta-model for the safety assessment of rock wedges. Hariri-Ardebili et al. [43] proposed a machine learning-aided probabilistic seismic demand model for concrete dams using both real and artificial ground motions. Li et al. [44] developed an efficient methodology for risk analysis of dams with a large number of seismic waves which is based on screening for intensity measures and a surrogate model.

While many of the above-mentioned studies have adopted multiple machine learning (or surrogate) methods to generalize the findings, those methods were selected according to the personal preference of the analyst (or the availability and/or capability of the software), and thus, there is no generalized recommendation regarding the efficiency of a particular method. Moreover, none of them have used an automated machine learning framework for regression or classification purposes. In the following section, the concept of AutoML is explained.

#### *2.2. Underpinning Theory*

Automated machine learning, also known as AutoML, is a growing field that aims to allow users with varying backgrounds and expertise to design an end-to-end machine learning system for the problem at hand [45,46]. The automation process facilitates several crucial and time-consuming aspects, including feature processing or engineering, model discovery, and hyperparameter tuning. Given a set of raw features or attributes, the first step is to find a set of meaningful and usable features to be passed to a machine learning method. Examples include converting categorical features to numerical values and feature scaling methods, such as standardization [47], to ensure that all features are in the same range and treated equally. Model discovery refers to finding the best learning method among a set of candidate machine learning methods. For example, in the context of regression which is the main focus of this paper, eligible machine learning methods may include linear/polynomial regression, instance-based learning techniques such as nearest neighbors, or even neural networks. On the other hand, hyperparameter tuning involves selecting the best hyperparameters for the learning method which is deemed to be the final choice. Hyperparameters can be viewed as external parameters that have to be specified before training machine learning models, and are known to significantly impact the outcome, such as the number of nearest neighbors when using instance-based models. In machine learning, the combination of model discovery and hyperparameter tuning is typically called model selection.
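As a small illustration of the feature processing step mentioned above, the Python sketch below standardizes a numeric feature and converts a categorical one into numerical indicator columns, using only the standard library. The feature names and values are hypothetical; in scikit-learn, the corresponding tools are `StandardScaler` and `OneHotEncoder`.

```python
from statistics import mean, pstdev


def standardize(column):
    """Rescale a numeric feature to zero mean and unit variance."""
    mu, sigma = mean(column), pstdev(column)
    return [(value - mu) / sigma for value in column]


def one_hot(column):
    """Convert a categorical feature into 0/1 indicator columns
    (one column per category, in sorted order)."""
    categories = sorted(set(column))
    return [[1 if value == c else 0 for c in categories] for value in column]


# Hypothetical raw features: dam height (m) and a rock-type label.
height_m = [60.0, 90.0, 120.0]
rock_type = ["granite", "basalt", "granite"]

height_scaled = standardize(height_m)
rock_encoded = one_hot(rock_type)

print(height_scaled)
print(rock_encoded)
```

After this step, every feature is numeric and on a comparable scale, so distance-based learners such as nearest neighbors are not dominated by whichever variable happens to have the largest units.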

Therefore, AutoML holds great potential to make machine learning and data science more accessible across scientific disciplines to extract patterns and make data-driven decisions. For example, AutoML has been deployed in a wide array of applications, such as image-based plant phenotyping [48], fault severity diagnosis in industrial processes [49], reducing manufacturing costs [50], analyzing biological data [51], and predicting the casualty rate and economic loss induced by earthquakes [52]. Despite the recent progress in this area, we would like to highlight the proper way to evaluate the effectiveness of AutoML techniques. After finalizing a learning method and its hyperparameters using AutoML, one should evaluate its performance on a test or hold-out data set to report its generalization error. This additional step allows us to decouple the model selection process and model assessment, reflecting the "true" performance of the selected model or surrogate when facing new cases. The overall procedure is depicted in Figure 1 via a flowchart. The training data set will be used along with an AutoML technique to select the final model and its hyperparameters, while the test data will be used to report evaluation metrics on unseen data to measure the generalization error.

**Figure 1.** Proper performance evaluation of AutoML techniques, by holding out part of the available data as a test set.

Given the above introduction of AutoML, we explain the underlying concepts in more detail. Let $\mathcal{A} = \{A^{(1)}, \ldots, A^{(N)}\}$ represent a set of $N$ eligible machine learning algorithms. Moreover, each $A^{(n)}$, $n = 1, \ldots, N$, comes with a set of hyperparameter configurations, represented by $\Lambda^{(n)}$. Furthermore, let us assume that the training data in Figure 1, called $\mathcal{D}_{\text{train}}$, is split into $K$ cross-validation folds $\{\mathcal{D}_{\text{train}}^{(1)}, \ldots, \mathcal{D}_{\text{train}}^{(K)}\}$ and $\{\mathcal{D}_{\text{valid}}^{(1)}, \ldots, \mathcal{D}_{\text{valid}}^{(K)}\}$, such that $\mathcal{D}_{\text{train}}^{(k)} = \mathcal{D}_{\text{train}} \setminus \mathcal{D}_{\text{valid}}^{(k)}$ for $k = 1, \ldots, K$. Given a loss function $\mathcal{L}$ that measures the prediction quality, we make the assumption that $\mathcal{L}(A_{\lambda}^{(n)}, \mathcal{D}_{\text{train}}^{(k)}, \mathcal{D}_{\text{valid}}^{(k)})$ represents the loss evaluated on the validation data $\mathcal{D}_{\text{valid}}^{(k)}$ using the training data $\mathcal{D}_{\text{train}}^{(k)}$ and the learning method $A^{(n)}$ with its corresponding hyperparameter choices $\lambda \in \Lambda^{(n)}$. With this notation in place, the main idea behind automated machine learning is to solve the following minimization problem in an efficient and robust manner:

$$\underset{A^{(n)} \in \mathcal{A},\ \lambda \in \Lambda^{(n)}}{\arg\min} \ \frac{1}{K} \sum_{k=1}^{K} \mathcal{L}\left(A_{\lambda}^{(n)}, \mathcal{D}_{\text{train}}^{(k)}, \mathcal{D}_{\text{valid}}^{(k)}\right) \tag{1}$$

The solution to the above problem results in finding the best learning method and its hyperparameters, which can be used as a surrogate model to capture the behavior of the desired system as accurately as possible. As mentioned before and depicted in Figure 1, an additional step is to evaluate the performance of the selected model on a hold-out test data set to ensure that the model generalizes well beyond the existing data $\mathcal{D}_{\text{train}}$.
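The minimization in Equation (1) can be sketched in plain Python as an exhaustive search: every pair of algorithm and hyperparameter setting is scored by its average validation loss over $K$ folds, and the pair with the lowest score is kept. The toy data, the two candidate learners, and the interleaved fold assignment are assumptions made for illustration only; AutoML frameworks explore a far larger space with more efficient search strategies than this brute-force loop.

```python
from statistics import mean


def knn_factory(k):
    """Return a fit function for a 1D k-nearest-neighbor regressor."""
    def fit(xs, ys):
        def predict(x):
            nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x))[:k]
            return mean(y for _, y in nearest)
        return predict
    return fit


def mean_factory():
    """Return a fit function for a baseline that predicts the training mean."""
    def fit(xs, ys):
        mu = mean(ys)
        return lambda x: mu
    return fit


def mse(predict, xs, ys):
    """Loss function L: mean squared error on a validation fold."""
    return mean((predict(x) - y) ** 2 for x, y in zip(xs, ys))


def cv_loss(fit, xs, ys, K):
    """Average validation loss over K folds, i.e., the objective of Equation (1)."""
    losses = []
    for k in range(K):
        valid = set(range(k, len(xs), K))  # indices forming validation fold k
        train = [i for i in range(len(xs)) if i not in valid]
        predict = fit([xs[i] for i in train], [ys[i] for i in train])
        losses.append(mse(predict, [xs[i] for i in valid], [ys[i] for i in valid]))
    return mean(losses)


# Toy training data (hypothetical): y = x^2 sampled on a regular grid.
xs = [i / 10 for i in range(30)]
ys = [x ** 2 for x in xs]

# Search space: (algorithm, hyperparameter) pairs mapped to their fit functions.
search_space = {("knn", k): knn_factory(k) for k in (1, 3, 7)}
search_space[("mean", None)] = mean_factory()

scores = {cfg: cv_loss(fit, xs, ys, K=5) for cfg, fit in search_space.items()}
best = min(scores, key=scores.get)
print(best, scores[best])
```

The selected configuration would then be refit on all of the training data and scored once on the hold-out test set, keeping model selection and model assessment separate.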

Among the available AutoML techniques/frameworks, we have decided to use auto-sklearn [53,54] in this paper for five main reasons. First, auto-sklearn is a Python-based open-source toolkit, which resembles the widely-used scikit-learn machine learning package, also known as sklearn [55]. This means that we can use similar methods such as "fit" and "predict" to train and evaluate models, respectively. Second, during the optimization process, auto-sklearn can automatically create an ensemble of top-performing models, instead of reporting a single model with the highest accuracy. To be more formal, the final solution of auto-sklearn can take the form of $\sum_n \beta_n A_{\lambda}^{(n)}$, where the weights should satisfy $0 \le \beta_n \le 1$ and $\sum_n \beta_n = 1$. As a result, the top-performing models will have $\beta_n > 0$ to contribute to the final surrogate model. It has been shown that ensemble methods provide an efficient way to improve predictive accuracy, e.g., [56], which makes auto-sklearn a very attractive choice. Third, auto-sklearn allows us to solve multi-output problems in which the goal is to predict multiple quantities of interest at the same time. This is an important feature of auto-sklearn because, for the application of seismic design of dams, we have to describe the system's behavior using various quantities, and training distinct models for each quantity becomes intractable. Currently, PyCaret, which is another popular AutoML framework [57], does not support multi-output regression models, which is a substantial drawback for many applications, including the problem of interest in this paper.
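The weighted-ensemble form described above can be illustrated with a short Python sketch in which fitted models are represented as plain functions and combined with convex weights. The three toy models and their weights are hypothetical, and the sketch does not show how auto-sklearn determines the weights; it only evaluates the weighted sum once they are given.

```python
def ensemble_predict(models, weights, x):
    """Weighted combination of model predictions with convex weights."""
    if any(w < 0 or w > 1 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must lie in [0, 1] and sum to 1")
    return sum(w * model(x) for model, w in zip(models, weights))


# Hypothetical fitted models, represented as functions of a single input.
models = [lambda x: 2.0 * x, lambda x: 2.0 * x + 1.0, lambda x: 0.0]
weights = [0.5, 0.5, 0.0]  # the third model has zero weight and does not contribute

y_hat = ensemble_predict(models, weights, 3.0)
print(y_hat)
```

Models with zero weight drop out of the sum, so the final surrogate only depends on the top-performing candidates.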

The fourth reason is that auto-sklearn runs within a user-determined time budget, with a default value of one hour. Therefore, the user has the option to spend more or less time depending on the computational requirements and the availability of resources. Finally, the fifth reason is that the search space of auto-sklearn is large and considers various regression models and classifiers from the scikit-learn library. For example, in auto-sklearn 0.15.0, the most recent version and the one used in this paper, the following regression models $\mathcal{A}$ are included in the search space:


- Probabilistic model: Gaussian process (GP) regression;
- K-nearest neighbor (KNN) and support vector regression (SVR);
- Neural networks: Multilayer perceptron (MLP).

Although we refer interested readers to the sklearn documentation page for more detailed information and updates regarding these models and their implementations, we can immediately see the diversity of the models that are included in solving the above optimization problem. To demonstrate the ease-of-use of auto-sklearn for practitioners without a deep knowledge of machine learning methods, we show relevant code snippets for performing the three main steps involved in our proposed framework in Listing 1: (1) dividing the available data into training and test sets, (2) model selection using $\mathcal{D}_{\text{train}}$, and (3) model assessment via $\mathcal{D}_{\text{test}}$. To save space, we listed $R^2$, or the coefficient of determination, as the only score to measure the quality of predictions, but we will use a broader list of evaluation metrics in our numerical results. As a final point, once auto-sklearn finds the best model according to the search space and the given time budget, we can store/load the surrogate model "automl" to make predictions in the future.

**Listing 1.** Sample of auto-sklearn code to perform the proposed framework. <sup>222</sup> "automl" to make predictions in the future.

```
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score
from autosklearn.regression import AutoSklearnRegressor

# step 1: train/test split, (X, y) is the entire data set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# step 2: model selection, time budget: 1 hour or 3600 sec
automl = AutoSklearnRegressor(
    time_left_for_this_task=3600)
automl.fit(X_train, y_train)

# step 3: model assessment (R2 or other metrics)
r2_score(y_test, automl.predict(X_test))
```
#### **3. Design Variables**

The feasibility-level design of a gravity dam includes the selection of an appropriate cross-section, together with the material properties of the coupled system, that satisfies the design objectives under the applied loads. Figure 2 illustrates a generic gravity dam including the dimensions, material properties, and loading.

**Figure 2.** Generic shape of a gravity dam including design variables, material parameters, water, and seismic loads.

#### *3.1. Dam Shape*

Probably the most important task during seismic design is to select an optimal initial cross-section. Multiple sources offer a cross-section for concrete gravity and arch dams. A large inventory of dams has been studied, and a generic dam shape is developed using seven length-related variables, as shown in Figure 3 (*L*<sub>1</sub> to *L*<sub>7</sub>). All dimensions are random functions of the dam base. Furthermore, the reservoir is modeled by assuming a random water level between 50% and 100% of the dam height (corresponding to winter and summer conditions). All the generated dam layouts are consistent with current dam design practices or an existing dam.

• *L*<sub>1</sub> ∈ (50, 150) m.

• *L*<sub>2</sub> = *L*<sub>1</sub> × *α*<sub>1</sub>; *α*<sub>1</sub> ∈ (0.00, 0.05) → *L*<sub>2</sub> ∈ (0, 7) m.

• *L*<sub>3</sub> = *L*<sub>4</sub> × *α*<sub>3</sub>; *α*<sub>3</sub> ∈ (1.00, 1.20) → *L*<sub>3</sub> ∈ (7, 40) m.
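The random generation of shapes can be illustrated with a minimal sketch. It covers only the quantities whose ranges are stated explicitly in the text (*L*<sub>1</sub>, *L*<sub>2</sub> = *L*<sub>1</sub> × *α*<sub>1</sub>, and the water level ratio). The uniform distributions, the function name `sample_partial_shape`, and the number of samples are assumptions made for illustration and are not taken from the original Matlab code.

```python
import random

def sample_partial_shape(rng):
    """Draw the shape quantities whose ranges are stated in the text.

    Uniform sampling is an assumption; the text only gives the ranges.
    """
    base = rng.uniform(50.0, 150.0)       # L1, dam base (m)
    alpha1 = rng.uniform(0.00, 0.05)      # ratio L2 / L1
    water_ratio = rng.uniform(0.5, 1.0)   # water level / dam height
    return {"L1": base, "L2": base * alpha1, "water_ratio": water_ratio}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
shapes = [sample_partial_shape(rng) for _ in range(150)]
```

Expressing *L*<sub>2</sub> through the ratio *α*<sub>1</sub> keeps the generated dimensions proportionate to the dam base, which is the property the text relies on.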


#### *3.2. Material Properties*

The design of a new dam requires the definition of material properties for finite element simulations. The concrete properties mainly depend on the mix design and on the availability of the ingredients (e.g., sand, gravel, and cement) near the dam site. The rock properties are typically obtained from geological surveys. However, for feasibility-level design, the exact rock properties might not be available yet. Moreover, the reservoir bottom reflection coefficient, *α*<sub>w</sub>, which simulates the impact of bottom sediments and alluvium, is needed. For new dams, *α*<sub>w</sub> > 0.9 is typically used. However, to account for long-term effects, smaller values should be used. Seven properties are assumed to be unknown during the design process. Each one covers a wide range of possible values.

Concrete modulus of elasticity *E*<sub>c</sub> ∈ [15, 45] GPa, concrete Poisson's ratio *ν*<sub>c</sub> = 0.2 (fixed), concrete mass density *ρ*<sub>c</sub> ∈ [2200, 2600] kg/m<sup>3</sup>, concrete hysteretic damping *η*<sub>c</sub> ∈ [0.02, 0.10], rock modulus of elasticity *E*<sub>r</sub> ∈ [15, 45] GPa, rock Poisson's ratio *ν*<sub>r</sub> = 0.33 (fixed), rock mass density *ρ*<sub>r</sub> ∈ [2200, 2800] kg/m<sup>3</sup>, rock hysteretic damping *η*<sub>r</sub> ∈ [0.02, 0.08], and the reservoir bottom wave reflection coefficient *α*<sub>w</sub> ∈ [0.5, 0.9]. All material properties are sampled based on a truncated normal distribution using the Latin Hypercube sampling technique. No correlation is assumed among these variables. Figure 4 shows the distribution of the material properties used for surrogate modeling. Concrete compressive strength, *f*′<sub>c</sub>, is not directly used in finite element analyses; however, the results of linear elastic simulations should be compared to the tensile (*f*′<sub>t</sub> ≈ 0.1 *f*′<sub>c</sub>) and compressive strengths to ensure the demand does not exceed the capacity.
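As an illustration of the sampling scheme, the following sketch draws a Latin Hypercube sample from a truncated normal distribution for one variable, using inverse-CDF mapping. The mean and standard deviation (midpoint and one quarter of the range) are assumptions, since the text states only the bounds, and the helper name `lhs_truncated_normal` is hypothetical.

```python
import random
from statistics import NormalDist

def lhs_truncated_normal(n, low, high, mean, std, rng):
    """Latin Hypercube sample of size n from a normal truncated to [low, high]."""
    dist = NormalDist(mean, std)
    p_low, p_high = dist.cdf(low), dist.cdf(high)
    # One uniform draw inside each of the n equal-probability strata
    u = [(k + rng.random()) / n for k in range(n)]
    rng.shuffle(u)  # random order, so variables sampled separately stay uncorrelated
    # Map each probability into [p_low, p_high] and invert the normal CDF
    samples = [dist.inv_cdf(p_low + ui * (p_high - p_low)) for ui in u]
    # Guard against round-off at the bounds
    return [min(max(x, low), high) for x in samples]

rng = random.Random(42)
# Concrete modulus of elasticity in GPa; mean and std are illustrative assumptions
E_c = lhs_truncated_normal(150, 15.0, 45.0, mean=30.0, std=7.5, rng=rng)
```

Calling the function once per material property, each with its own shuffle, gives a design in which every variable is stratified and no correlation is imposed among variables.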

**Figure 3.** Inventory of all gravity dam shapes generated based on a Matlab code including a random water level in red; the box size is 170 × 240 m in all cases.

**Figure 4.** Distribution of material properties.

#### *3.3. Loads*

The inputs to the finite element model include both the ground motion records and the water pressure (both hydrostatic and hydrodynamic components). As discussed earlier, the water level is assumed to be 50–100% of the dam height. Subsequently, the corresponding hydrostatic and hydrodynamic pressures are automatically computed and applied by the software. For the seismic simulations, a large database of 160 ground motions is selected worldwide to consider the aleatory uncertainty. While the current practice in earthquake engineering is to select the ground motion records based on the seismic hazard characteristics of the dam site, a random record selection process is used in this paper to generate the surrogate model (later we will show the application of ground motion selection and scaling for a particular dam site). A random ground motion selection is especially useful because none of the generic dams in this paper are associated with a specific site/location. Moreover, variation of the geometry and material properties in the generic models changes the vibration characteristics of the structures (e.g., fundamental period). Thus, ground motion selection and scaling methods such as spectral matching are not practical methods.

Since a ground motion record has a stochastic nature, one cannot directly use it for classical machine learning regression (unless a time series regression is used, which is a complex task). Therefore, it is more efficient to extract several meta-features that distinguish the ground motion records based on their unique characteristics. A wide range of time-, frequency-, spectral-, and intensity-dependent intensity measure (IM) parameters are summarized in Table 1 [58,59]. For each ground motion signal, fifteen IM parameters are extracted. Figure 5 shows the correlation among the fifteen IM parameters for a pilot dam (i.e., a dam whose shape and material parameters are all close to the median of the design space). The lower triangular cells show the pairwise scatter of the IMs over the 160 ground motion records. The diagonal cells are the histograms of the data points. The upper triangular cells report the correlation in terms of Spearman's rank correlation coefficient (R) and the *p*-values for testing the hypothesis of no correlation against the alternative hypothesis of a nonzero correlation (P). As seen, the significant duration, *t*<sub>sig</sub>, has the lowest correlation with the other IMs, while the first-mode spectral values have the highest correlation with the other IMs.
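To make the idea of meta-feature extraction concrete, the sketch below computes three common IMs from an acceleration history: peak ground acceleration, Arias intensity, and significant duration. The 5–95% Arias-intensity definition of significant duration is a common convention assumed here, not a definition confirmed by the text. The synthetic record and the function name `intensity_measures` are illustrative only.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def intensity_measures(acc, dt):
    """Return PGA, Arias intensity, and 5-95% significant duration.

    acc: acceleration history in m/s^2, dt: time step in s.
    """
    pga = max(abs(a) for a in acc)
    # Running integral of a(t)^2 using the rectangle rule
    cumulative, total = [], 0.0
    for a in acc:
        total += a * a * dt
        cumulative.append(total)
    arias = math.pi / (2.0 * G) * total
    # Times at which 5% and 95% of the total energy have accumulated
    t5 = next(i for i, c in enumerate(cumulative) if c >= 0.05 * total) * dt
    t95 = next(i for i, c in enumerate(cumulative) if c >= 0.95 * total) * dt
    return {"PGA": pga, "Ia": arias, "t_sig": t95 - t5}

# Synthetic record (assumption): a decaying 2 Hz sine sampled at 100 Hz for 20 s
dt = 0.01
acc = [2.0 * math.exp(-0.2 * i * dt) * math.sin(2.0 * math.pi * 2.0 * i * dt)
       for i in range(2000)]
ims = intensity_measures(acc, dt)
```

Each record is thereby reduced to a short vector of scalars, which is the form a standard regression model expects.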

**Figure 5.** Matrix of fifteen seismic intensity measures for all the applied ground motion records. R: Correlation, and P: *p*-value.


**Table 1.** A list of ground motion IMs [58].

Note: *ü*(*t*), *u̇*(*t*), and *u*(*t*) are acceleration, velocity, and displacement time histories, respectively.

#### **4. Data Structure**

So far, all the design variables including the geometry parameters, material properties, and loading have been discussed. In this section, the finite element software is introduced, and the data structure is explained.

#### *4.1. Software*

The finite element code EAGD [60] is used for dynamic analyses, where the foundation rock is idealized as a homogeneous, isotropic, viscoelastic half-plane. A two-dimensional model with 480 elements is developed, which is reasonable for linear elastic systems. The dam-foundation interaction effects are included by adding the dynamic stiffness matrix for the rock region to the dam's equation of motion. This frequency-dependent matrix is defined with respect to the degrees of freedom of the nodal points at the dam base [61]. The reservoir water is idealized by a fluid domain of constant depth and infinite length in the upstream direction. The dissipation of hydrodynamic pressure waves by the reservoir bottom materials is accounted for by applying a boundary condition that partially absorbs the incident waves. The system is analyzed based on the following load cases: dam self-weight, water pressure, and the free-field horizontal component of the earthquake ground motion.

#### *4.2. Input-Output Coverage*

Matlab [62] is paired with EAGD to automate the finite element simulations. A total of 24,000 simulations have been conducted, which cover a wide range of dam shapes, material properties, and ground motions. Figure 6 shows the data structure. Two side matrices of 160 × 15 × 150 and 15 × 150 are used as inputs in AutoML. The former is a 3D matrix that identifies the characteristics of the ground motion records. One may note that some of the IM parameters depend on the vibration period of the dam (e.g., *S*<sub>a</sub>(*T*<sub>1</sub>)), and thus, this side matrix is three-dimensional and not two-dimensional. On the other hand, the side matrix which defines the geometry, material, and water level is a two-dimensional matrix, as it does not depend on the applied seismic load.
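One way to read these dimensions is that 160 ground motions combined with 150 dam configurations give 160 × 150 = 24,000 simulations, each described by 15 dam-related inputs and 15 IMs. The sketch below flattens the two side matrices into one table with a row per simulation. This flattening is an assumption about how the inputs are assembled, and the zero-filled placeholders stand in for the real values.

```python
N_GM, N_IM, N_DAM, N_PAR = 160, 15, 150, 15

# Placeholder side matrices with the shapes given in the text (dummy values)
im = [[[0.0] * N_DAM for _ in range(N_IM)] for _ in range(N_GM)]  # 160 x 15 x 150
par = [[0.0] * N_DAM for _ in range(N_PAR)]                        # 15 x 150

rows = []
for g in range(N_GM):          # loop over ground motion records
    for d in range(N_DAM):     # loop over dam configurations
        dam_inputs = [par[p][d] for p in range(N_PAR)]   # geometry, material, water
        gm_inputs = [im[g][k][d] for k in range(N_IM)]   # IMs for this record and dam
        rows.append(dam_inputs + gm_inputs)

n_samples, n_features = len(rows), len(rows[0])
```

The third index of the IM matrix is what allows period-dependent measures to take a different value for each dam configuration.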

Within the probabilistic simulation framework, and after completing any single finite element analysis, the results are post-processed, and the required information is extracted. Data are stored in the form of a 2D matrix (for scalar quantities) and a 3D matrix (for spatial and temporal quantities).

• Scalar quantities cover the maximum (or minimum) response of the dam at a particular location over the entire duration of the applied ground motion. For example, the maximum crest displacement shows the "global" behavior of the dam under the applied motion. Similarly, the maximum first principal stress at the dam heel is a "local" metric that indicates the onset of cracking (if it exceeds the tensile strength). Other peak (i.e., maximum or minimum) response quantities can be extracted from the displacement, stress, and strain results.

• Vector quantities cover the responses over time, or they present the spatial distribution of the response parameters. The cumulative inelastic duration (CID) shows the time intervals in which the stress at a particular location exceeds the tensile strength. The overstressed area (OA) illustrates the spatial distribution of regions within the dam body where the tensile stress exceeds the tensile strength (or a multiple of it).
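The following sketch shows how a peak response, the CID, and the OA could be post-processed from the raw results. The stress history, element peaks, element areas, and the tensile strength of 1.0 MPa are hypothetical values chosen for illustration, and the function names are not from the original post-processing code.

```python
def peak_and_cid(stress, dt, f_t):
    """Peak stress and cumulative inelastic duration for one stress history."""
    peak = max(stress)
    # Total time during which the stress exceeds the tensile strength
    cid = sum(dt for s in stress if s > f_t)
    return peak, cid

def overstressed_area(element_peaks, element_areas, f_t):
    """Sum of the areas of elements whose peak tensile stress exceeds f_t."""
    return sum(a for s, a in zip(element_peaks, element_areas) if s > f_t)

# Hypothetical heel stress history (MPa) sampled every 0.01 s
stress = [0.2, 0.9, 1.4, 1.1, 0.3, -0.5, 1.2, 0.4]
peak, cid = peak_and_cid(stress, dt=0.01, f_t=1.0)

# Hypothetical per-element peak stresses (MPa) and areas (m^2)
oa = overstressed_area([0.4, 1.2, 1.6], [10.0, 12.0, 8.0], f_t=1.0)
```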

**Figure 6.** Data structure.

While it is possible to use as many output parameters as desired in developing a surrogate model, for practical implementation, a total of ten response quantities are considered in this paper. They are:


#### **5. Results: Surrogate Model**

In this section, we examine the efficacy of auto-sklearn, explained in Section 2, for developing accurate machine learning-based surrogate models mapping the design variables to the quantities of interest (QoI). Specifically, we consider three scenarios that involve predicting: (1) Out2, i.e., a single output, (2) outputs 1 through 6, and (3) all 10 outputs discussed in the previous section. The main intention behind this analysis is to better understand the tradeoff between the number of outputs and the quality of the final surrogate models when using AutoML techniques. We hypothesize that an increase in the number of outputs makes it more challenging to find a model (and its corresponding hyperparameters) that performs well across all the desired outputs. Before stating our results, note that the studied outputs fall within substantially different ranges. Thus, it makes sense to apply a linear transformation to each output individually such that all values of a desired quantity of interest are transformed into the range [0, 1]. That is, the minimum and maximum values of the transformed output will be 0 and 1, respectively. To this end, we use the "MinMaxScaler" method from sklearn, where the transformation is given by (*y* − min)/(max − min) for each output value *y*.
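The transformation can be written out directly, as in the minimal sketch below, which applies (*y* − min)/(max − min) to one output column. The sample values are hypothetical. They are all negative, to show that after scaling the values of small magnitude end up near 1 and the most negative value maps to 0.

```python
def min_max_scale(values):
    """Map a list of values linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical output column containing only negative values
raw = [-8.0, -0.4, -0.2, -1.0, -0.1]
scaled = min_max_scale(raw)
```

For a single column this reproduces what sklearn's `MinMaxScaler` does with its default feature range of (0, 1).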

For all experiments, we use three evaluation metrics that are specifically designed for regression problems. The first one is *R*<sup>2</sup>, the coefficient of determination, which is defined as follows for a set of *n* finite element results *y*<sub>1</sub>, …, *y*<sub>*n*</sub> and their predicted values obtained by a machine learning model *ŷ*<sub>1</sub>, …, *ŷ*<sub>*n*</sub>:

$$R^2(y, \hat{y}) = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},\tag{2}$$

where *ȳ* = (1/*n*) ∑<sub>*i*=1</sub><sup>*n*</sup> *y*<sub>*i*</sub> is the sample mean. The best possible score is 1, attained when *y*<sub>*i*</sub> = *ŷ*<sub>*i*</sub> for all cases, and the score is 0 when all the predicted values are equal to the sample mean *ȳ*, which means that the machine learning model has not learned any patterns. Therefore, *R*<sup>2</sup> values closer to 1 indicate a more accurate machine learning-based surrogate model. The two other evaluation metrics are the Root Mean Square Error (RMSE) and the Mean Absolute Error (MAE). We can compute these two metrics as follows:

$$\text{RMSE}(y, \hat{y}) = \left(\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2\right)^{1/2} \tag{3}$$

$$\text{MAE}(y, \hat{y}) = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|. \tag{4}$$

Based on these formulas, when *y*<sub>*i*</sub> = *ŷ*<sub>*i*</sub> for all cases, both metrics equal zero, and, overall, values closer to 0 indicate more accurate machine learning or surrogate models.
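The three metrics in Equations (2)–(4) can be checked with a few lines of plain Python. The sketch below is a direct transcription of the formulas, applied to hypothetical data, and verifies the two limiting cases discussed above: a perfect prediction and a prediction equal to the sample mean.

```python
import math

def r2(y, y_hat):
    """Coefficient of determination, Equation (2)."""
    y_bar = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - y_bar) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def rmse(y, y_hat):
    """Root mean square error, Equation (3)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))

def mae(y, y_hat):
    """Mean absolute error, Equation (4)."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)

y = [0.10, 0.20, 0.30, 0.40]            # hypothetical scaled outputs
perfect = list(y)                       # prediction equal to the truth
mean_only = [sum(y) / len(y)] * len(y)  # prediction equal to the sample mean
```

Here `r2(y, perfect)` returns 1 and `r2(y, mean_only)` returns 0, while both RMSE and MAE vanish for the perfect prediction.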

#### *5.1. Scenario 1: Single Output*

In the first scenario, we consider only Out2 (i.e., the maximum principal stress at the heel) and use the default time limit of one hour when using auto-sklearn for model selection (we fix the time limit throughout this section). Table 2 reports the final result, including the types of machine learning models used in the ensemble and their corresponding weights *β*<sub>*n*</sub>, as well as the length of time each model was optimized for (called duration). To interpret this table, note that the rank of each model is based on the calculated value of the loss function that we discussed in Section 2; Rank 1 has the lowest value of the loss. In terms of the ensemble method selected by auto-sklearn, as expected, we can confirm that ∑<sub>*n*</sub> *β*<sub>*n*</sub> = 1. Furthermore, we notice that the final model consists of two main types of machine learning models: Gradient Boosting and ARD Regression. Gradient Boosting is a boosting algorithm for regression that combines weak learners, and the main difference between the Gradient Boosting models listed in the table lies in critical hyperparameters, such as the maximum depth of the tree and the minimum number of samples required to split an internal node [63]. On the other hand, the most influential model in the ensemble according to the assigned weight is ARD Regression, which can be viewed as a Bayesian extension of linear regression, where the parameters of the regression model are assumed to follow Gaussian distributions. Due to its probabilistic nature, training such models can be time-consuming, as evident from the duration column of Table 2.
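To illustrate how the weights *β*<sub>*n*</sub> enter the final prediction, the sketch below combines member predictions as a weighted sum, assuming the ensemble output is the *β*-weighted average of its members. The weights and member predictions are hypothetical and do not correspond to Table 2.

```python
def ensemble_predict(weights, member_predictions):
    """Weighted average of member predictions; the weights must sum to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("ensemble weights must sum to 1")
    n_points = len(member_predictions[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, member_predictions))
        for i in range(n_points)
    ]

weights = [0.5, 0.3, 0.2]   # hypothetical beta_n values
member_predictions = [      # one list of predictions per ensemble member
    [0.10, 0.20],
    [0.12, 0.18],
    [0.08, 0.25],
]
y_hat = ensemble_predict(weights, member_predictions)
```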

Next, we present a parity plot to better understand the performance of the final model using both training and test data sets. This step is crucial because we should show that the final model is not suffering from overfitting, which is a common problem when the selected machine learning model is too complex for the problem at hand. In particular, Figure 7 plots the true values of the quantity of interest obtained via our finite element model versus the predicted values produced by the machine learning model that auto-sklearn selected. The reason that we focus on the range [0, 0.3], instead of [0, 1], is that the majority of scaled output values fall within this range. Hence, this allows us to visualize the behavior of the surrogate model more closely in this regime of interest. However, we use the entire data set to report evaluation metrics. In the title of this figure, the accuracy score refers to *R*<sup>2</sup>, the coefficient of determination, and the best possible score is 1. Therefore, we corroborate that the trained surrogate model performs well on both training and test data sets. In addition, we evaluated the performance of this model on the test data using the other discussed metrics: RMSE = 0.008 and MAE = 0.004, showing the reasonable performance of the final surrogate model.

**Table 2.** The result of model selection performed by auto-sklearn when considering a single output (Scenario 1).


**Figure 7.** Scenario 1 (Out2): plotting true vs. predicted values for both training and test samples. The available data are split into train/test sets according to Figure 1. We see that the surrogate model performs well on both training and test sets (acc in the title refers to *R*<sup>2</sup>).

#### *5.2. Scenario 2: Multi-Output, Out1 through Out6*

In the second scenario, we consider a multi-output setting, where the objective is to develop a surrogate model to predict six quantities of interest: Out1 through Out6. Similar to the previous experiment, we report the anatomy of the final model, including ranks, ensemble weights/models, and duration, in Table 3. Note that the final model primarily consists of tree-based machine learning models. In fact, Extra Trees and Random Forests are similar in the sense that they both build multiple trees and split nodes using random subsets of features. Therefore, their main goal is to improve the predictive accuracy and control overfitting by constructing ensemble methods. However, as apparent in the table, such methods are computationally expensive. On the other hand, K-Nearest Neighbor methods are much faster when working with data sets consisting of a few thousand data points, which is the case in our study. The number of nearest neighbors, which is the most important hyperparameter, is set to 17 in this example.

Moreover, we provide parity plots in Figure 8, where each subfigure represents actual vs. predicted values for both training and test sets accompanied by *R*<sup>2</sup> values. Similar to the previous scenario, we focus on the range [0, 0.3] (except for Out5 and Out6) because most output values fall within this range. However, Out5 and Out6 take on negative values, and after the transformation that we discussed earlier in this section, the majority of data points, which have low absolute values, fall within [0.7, 1]. This is because negative numbers get smaller as their magnitude increases.

**Table 3.** The result of model selection performed by auto-sklearn when considering six outputs (Scenario 2).


To interpret the reported results in Figure 8, note that Out2 is shared between scenarios 1 and 2, and that the performance of the obtained surrogate model in the second case is slightly worse than the one produced in the first scenario. This is consistent with our hypothesis because the second case study aims to find a model that performs well across all six outputs, instead of a single output. Despite the minor accuracy reduction, we believe that the surrogate model in this scenario is more useful because it predicts multiple outputs at the same time, while the reported *R*<sup>2</sup> values are consistently above 0.9 across all the desired quantities of interest.

**Figure 8.** Scenario 2 (Out1 through Out6): plotting true vs. predicted values for both training and test samples. The available data are split into train/test sets according to Figure 1. We see that the surrogate model performs well on both training and test sets, and the *R*<sup>2</sup> values exceed 0.9 for all the studied outputs.

In addition, Table 4 reports the performance of the machine learning-based surrogate model on the testing data using three metrics. Based on these results, we conclude that the overall performance of our model is satisfactory. RMSE and MAE values are less than 0.01, which is small relative to the [0, 1] range of the transformed output values. Moreover, the overall *R*<sup>2</sup> value is about 0.95, meaning that the model has learned useful input-output patterns from the training data set.


**Table 4.** Evaluating the performance of the trained surrogate model using six outputs: Out1 through Out6.

#### *5.3. Scenario 3: Multi-Output, Out1 through Out10*

In this section, we extend our previous analysis to account for the 10 outputs explained in Section 4. Table 5 presents the structure of the final ensemble method identified by using auto-sklearn. This model mainly contains Decision Trees and the K-Nearest Neighbor regression models. From the computational standpoint, optimizing the Decision Tree model is typically more time-consuming than training the K-Nearest Neighbor model. However, from the predictive accuracy viewpoint, Decision Trees are popular because they learn simple decision rules from the available data to approximate a wide range of linear and nonlinear functions, which is helpful when considering various input-output mappings. By carefully reviewing the selected hyperparameters, we noticed that the main difference between the chosen Decision Trees is the minimum number of samples required to split an internal node in the tree (ranging from 3 to 19). On the other hand, the number of nearest neighbors for the two selected models is set to 4 and 17. As a final point, it is interesting to observe that the K-Nearest Neighbor regression model with 17 neighbors was also found in the previous scenario, where we considered 6 outputs.



Furthermore, Figure 9 presents parity plots for the resulting surrogate model when considering all ten outputs, showing actual values obtained by the finite element analysis on the horizontal axis and the corresponding predicted values by the surrogate model on the vertical axis. Comparing the first six outputs with the results from the previous case study, we again notice an insignificant reduction in the predictive accuracy measured by the *R*<sup>2</sup> score. This reduction is reasonable given that the new surrogate model should predict a larger number of outputs compared to the previous case. However, except for Out1 and Out10, the accuracy score exceeds 0.9. To have a more detailed analysis of this surrogate model, we report two additional metrics (RMSE and MAE) in Table 6. Based on these results, the overall *R*<sup>2</sup> score is above 0.9, and the final RMSE and MAE values are on par with the previous case, in which we considered only six outputs. Therefore, based on our analysis, we conclude that auto-sklearn is capable of performing model selection in challenging scenarios involving the prediction of multiple quantities of interest at the same time. As a result, auto-sklearn provides an easy-to-use framework for practitioners and non-experts in machine learning because it eliminates the need to develop multiple independent surrogate models for individual outputs.

**Figure 9.** Scenario 3 (Out1 through Out10): plotting true vs. predicted values for both training and test samples. The available data are split into train/test sets according to Figure 1. We see that the surrogate model performs well on both training and test sets, and the *R*<sup>2</sup> values exceed 0.9 for all the studied outputs except for Out1 and Out10.



#### **6. Results: Dam Design Engine**

Having obtained multiple surrogate models in Section 5, this section discusses their implementation in the context of a design engine. In general, a surrogate model aims to estimate the structural responses as a function of dam shape, material parameters, water level, and applied ground motion:

$$\mathbf{QoI} = g\left(\underbrace{\mathrm{Mat}_1, \cdots, \mathrm{Mat}_7}_{\text{Material}}, \underbrace{L_1, \cdots, L_7}_{\text{Shape}}, \underbrace{L_w}_{\text{Water}}, \underbrace{\mathrm{IM}_1, \cdots, \mathrm{IM}_{15}}_{\text{Seismic}}\right) \tag{5}$$

where each of the 30 input parameters takes a range of values, and **QoI** is a matrix of outputs (i.e., quantities of interest).

In any practical seismic dam design process, the seismic hazard scenario for which the dam should be designed is known (from probabilistic or deterministic seismic hazard analysis—PSHA/DSHA [64,65]) and is provided to the structural team by a seismologist. Moreover, the basic material properties are also available with good confidence. The foundation rock properties are determined by geologists including the profile of shear wave velocity, mass density, elasticity, permeability, shear strength, etc. [66]. The mechanical properties in concrete are governed by mix design and can be assumed to be known for the feasibility level design [67,68]. Finally, some of the shape parameters are known with good confidence at the feasibility level of design. For example, the total dam height is usually provided to the engineer by the project manager (with inputs from the dam owner, and hydrology team). Therefore, an inverse analysis can be performed on the pre-generated surrogate model using the known variables (specified with an asterisk in the following equation) to estimate the unknown ones:

$$\underbrace{\left[\mathrm{Mat}_{\sim i}, L_{\sim j}\right]}_{\text{Unknown variables}} = g^{-1}\left(\underbrace{L_w^{*}}_{\text{Water}}, \underbrace{\mathrm{IM}_1^{*}, \cdots, \mathrm{IM}_{15}^{*}}_{\text{Seismic}}, \underbrace{\mathrm{Mat}_i^{*}}_{\text{Material}}, \underbrace{L_j^{*}}_{\text{Shape}}, \mathbf{QoI}^{*}\left(\sigma_{p1,\max}^{heel}, \Delta_{\max}, \cdots\right)\right) \tag{6}$$

where the indices *i* and *j* denote the known material and shape variables, and ∼*i* and ∼*j* denote the remaining unknown variables. **QoI**<sup>∗</sup> is a series of target responses for the design earthquake.

To explain the technical aspects of the inverse analysis, let us assume that the final surrogate model obtained by auto-sklearn (or other AutoML techniques) takes the form of *g*(*θ*<sub>*i*</sub>, *θ*<sub>*j*</sub>, *θ*<sub>∼*i*</sub>, *θ*<sub>∼*j*</sub>), where *θ*<sub>*i*</sub> and *θ*<sub>*j*</sub> represent all known variables, which we can treat as constants, and the other two variables are unknown. Here, the goal is to find the "best" choices of *θ*<sub>∼*i*</sub> and *θ*<sub>∼*j*</sub> such that the value of the function *g* gets as close as possible to the target response *g*<sup>∗</sup>. There are two main steps involved in solving this problem: (1) defining a search space, i.e., the set of possible values for the unknown variables, and (2) casting an optimization problem. We denote the search space for each variable type by *θ*<sub>material</sub> and *θ*<sub>shape</sub>. With this notation in place, the optimization problem takes the following form:

$$\underset{\theta_1 \in \theta_{\text{material}},\ \theta_2 \in \theta_{\text{shape}}}{\arg\min} \ \left| g^{*} - g(\theta_i, \theta_j, \theta_1, \theta_2) \right|. \tag{7}$$

Since the size of the search space is finite, we can evaluate the objective function in the above optimization problem for each feasible solution, and then sort the solutions in ascending order of the objective value. This means that we will have a "ranked" list of possible solutions for the inverse problem.
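The ranking procedure can be sketched as an exhaustive search over a finite grid. In the example below, `toy_surrogate` is a hypothetical closed-form stand-in for the trained surrogate *g* (in practice one would call the fitted model's prediction method). The variable names, grid values, and target are assumptions chosen for illustration.

```python
from itertools import product

def toy_surrogate(E_c, L1, L3):
    """Hypothetical stand-in for the trained surrogate g (not the paper's model)."""
    return 40.0 / L1 + 0.01 * E_c + 0.005 * L3  # pretend heel stress, MPa

def rank_candidates(g, known, search_space, target):
    """Evaluate |target - g| on every grid point and sort in ascending order."""
    names = list(search_space)
    ranked = []
    for combo in product(*(search_space[name] for name in names)):
        unknown = dict(zip(names, combo))
        misfit = abs(target - g(**known, **unknown))
        ranked.append((misfit, unknown))
    ranked.sort(key=lambda item: item[0])  # best candidate first
    return ranked

known = {"E_c": 22.0}                                  # known variable
search_space = {                                       # finite search space
    "L1": [50.0 + 10.0 * k for k in range(11)],        # 50, 60, ..., 150 m
    "L3": [10.0, 20.0, 30.0],
}
ranked = rank_candidates(toy_surrogate, known, search_space, target=0.75)
best_misfit, best_design = ranked[0]
```

Keeping the whole sorted list, rather than only the minimizer, gives the designer a set of near-equivalent candidates to choose from.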

Design earthquake and ground motion records for dam projects are typically obtained from two main sources [7]: ICOLD and FEMA. While the detailed discussion on ground motion selection and scaling is beyond the coverage of this paper, some major aspects are clarified [69].

• **ICOLD recommendations:** There are two basic seismic loads for the design of new dams [70]. The Operating Basis Earthquake (OBE) represents the seismic intensity level at the dam site for which only minor (easily repairable) damage is acceptable and the dam should remain functional. The OBE corresponds to a return period of 145 years (50% probability of exceedance in 100 years). The Safety Evaluation Earthquake (SEE) represents the seismic intensity level at the dam site that a dam must be able to resist without the uncontrolled release of the reservoir water. The SEE ground motion can be obtained from a probabilistic and/or deterministic seismic hazard analysis. For large and high-consequence dams, the SEE is defined as (a) the Maximum Credible Earthquake (MCE) from DSHA, where the parameters should be estimated at the 84th percentile level, (b) the Maximum Design Earthquake (MDE) from PSHA, corresponding to a return period of 10,000 years (1% probability of exceedance in 100 years) [71,72].

• **FEMA recommendations:** Time-based performance assessment evaluates a dam's performance over a period of time, considering all earthquakes that may occur in that period and the probability that each will occur [73]. This procedure has the following main steps: (a) generate a seismic hazard curve, i.e., *λ* vs. *S*<sub>a</sub>(*T*<sub>1</sub>), (b) compute the seismic intensity range and split it into *N*<sub>i</sub> equal intervals, (c) develop a target response spectrum, *S*<sub>a</sub><sup>trg</sup>(*T*), for each intensity range, and (d) select and scale suites of *N*<sub>gm</sub> ground motions for each spectrum.
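The return periods quoted for the OBE and MDE follow from the stated exceedance probabilities if earthquake occurrences are assumed to follow a Poisson process. This assumption is the usual convention in hazard analysis and is adopted here for illustration. Under it, a probability of exceedance *p* in *t* years corresponds to a return period of −*t*/ln(1 − *p*).

```python
import math

def return_period(p_exceed, exposure_years):
    """Return period (years) for a probability of exceedance over an exposure time.

    Assumes Poisson occurrences: p = 1 - exp(-t / T), so T = -t / ln(1 - p).
    """
    return -exposure_years / math.log(1.0 - p_exceed)

obe = return_period(0.50, 100)  # about 144 years, quoted as 145 in the text
mde = return_period(0.01, 100)  # about 9950 years, quoted as 10,000 in the text
```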

Once the ground motions are scaled, all the intensity measure parameters listed in Table 1 should be calculated. While the majority of these IMs are structure-independent, some are calculated based on the vibration period of the system (e.g., *S*<sub>a</sub>(*T*<sub>1</sub>)). However, at this stage, the shape (and possibly some of the material properties) is unknown, and thus, direct finite element modeling cannot be used. Instead, a simplified method is used to estimate the initial fundamental period of the dam. The formulation is based on Algorithm 1, originally proposed in [74], which introduces a set of new dynamic compliance coefficients.


**Figure 10.** Standard values for the period lengthening ratio due to hydrodynamic effects, *R*<sub>r</sub>.

#### *Example of AutoML Seismic Design*

In this section, we elaborate on the practical implementation of the design engine for a hypothetical site in the north-central United States. The objective is to design a straight concrete gravity dam with a height of 100 m. The dam is located in a relatively wide valley. The normal water level is 95 m. According to the concrete mix design, the concrete modulus of elasticity and mass density are 22 GPa and 2400 kg/m<sup>3</sup>, respectively. The measured shear wave velocity for the top 30 m of rock is about 2250 m/s, and the rock mass density is 2600 kg/m<sup>3</sup>. Therefore, the modulus of elasticity of the rock is estimated to be 35 GPa. For the feasibility level design, constant hysteretic damping factors of 0.06 and 0.04 are assumed for the dam and foundation, respectively, which correspond to viscous damping ratios of 3% and 2% for each substructure. The combined damping value for the overall dam-water-foundation system is then larger than the damping values measured in field tests on several dams, and it is close to the 90th percentile of the data collected by Chopra [7]. Moreover, a reservoir bottom wave reflection coefficient of 0.9 is used.
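The material values quoted above can be cross-checked with standard elasticity relations. The sketch below is only an illustration: it assumes the shear modulus follows from G = ρ·V<sub>s</sub><sup>2</sup>, that E = 2G(1 + ν), and that the hysteretic damping factor is twice the viscous damping ratio. The Poisson's ratio of 0.33 is an assumption (it is not stated in the text) chosen because it reproduces the reported 35 GPa; the function names are hypothetical.

```python
def rock_modulus_gpa(vs_m_s, density_kg_m3, poisson):
    """Young's modulus (GPa) from shear wave velocity, density and Poisson's ratio."""
    shear_modulus_pa = density_kg_m3 * vs_m_s ** 2       # G = rho * Vs^2
    return 2.0 * shear_modulus_pa * (1.0 + poisson) / 1e9  # E = 2 G (1 + nu)


def viscous_ratio(hysteretic_factor):
    """Equivalent viscous damping ratio for a constant hysteretic damping factor."""
    return hysteretic_factor / 2.0


# Values from the example; poisson=0.33 is an assumed value, not given in the text.
e_rock = rock_modulus_gpa(2250.0, 2600.0, poisson=0.33)
xi_dam = viscous_ratio(0.06)
xi_foundation = viscous_ratio(0.04)

print(f"E_rock = {e_rock:.1f} GPa")
print(f"viscous damping: dam {xi_dam:.0%}, foundation {xi_foundation:.0%}")
```

With these assumptions the rock modulus comes out at about 35 GPa, and the damping ratios match the 3% and 2% stated in the text.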

Following a probabilistic seismic hazard analysis (which is beyond the scope of this paper), two seismic hazard levels with return periods (RP) of 2500 and 5000 years are identified for the dam site (denoted as RP2500 and RP5000). The target and individual response spectra for the scaled records are plotted in Figure 11a. The scaled records are shown in Figure 11b. While three-component records are typically scaled based on the target spectra, only a single component is shown for this hypothetical example.

**Figure 11.** Uniform hazard response spectra and three scaled ground motions for each hazard level. (**a**) Response spectra. (**b**) Ground motion records (left to right: GM1, GM2, and GM3).

While different objectives can be defined by the design team for the feasibility level design, this example is solely focused on the structural aspects and does not consider the construction cost or risk-informed constraints [13]. This means that the design accounts for neither the concrete volume used for construction nor the architectural elements. It also does not consider the population at risk downstream of the dam.

We assume that the maximum dynamic tensile stress at the dam heel should be limited to 0.75 MPa under the 2500-year return period scenario. No other constraint is considered in this example. However, it is possible to add more constraints on stress and/or displacement components. The three ground motion records in Figure 11b are processed, and fifteen IM parameters are extracted as listed in Table 1. Using engineering judgment, the strongest one (i.e., GM1) is used for design. The most comprehensive surrogate scenario is used (see Section 5.3), which is developed based on ten outputs. An inverse analysis is run, and the top 100 design candidates are identified. Figure 12 illustrates a parallel plot that connects all the design variables for the top 100 candidates to five major outputs (including crest displacement and maximum/minimum principal stresses at the heel and toe). A large variety of *L*<sub>1</sub> (dam base) values is included among these top 100 design candidates. The ∆<sub>max</sub> varies from 8.5 to 9.5 mm, which corresponds to 0.0085–0.0095% of the dam height. This is close to the threshold value of 0.01% recommended in [76] for gravity dams. The value of *σ<sub>p1,max</sub><sup>heel</sup>* is in the [0.81–0.82] MPa range, which is close to the threshold value (i.e., 0.75 MPa) previously defined for the inverse analysis. The variation in the other three stress quantities is small too.
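The ranking step of the inverse analysis can be illustrated with a minimal sketch. It is not the engine used in the paper: `toy_surrogate` is a hypothetical one-variable stand-in for the trained multi-output surrogate, and the candidate range for the base width is invented. Only the idea is retained, namely evaluating many candidate designs with the surrogate and keeping those whose predicted heel stress is closest to the 0.75 MPa target.

```python
import random


def toy_surrogate(base_width_m):
    """Hypothetical stand-in for the trained surrogate: heel stress in MPa."""
    return 60.0 / base_width_m


def inverse_analysis(surrogate, candidates, target, top_n):
    """Return the top_n candidates whose prediction is closest to the target."""
    ranked = sorted(candidates, key=lambda x: abs(surrogate(x) - target))
    return ranked[:top_n]


random.seed(0)
# Assumed search range for the base width (m); purely illustrative.
candidates = [random.uniform(60.0, 100.0) for _ in range(5000)]
top = inverse_analysis(toy_surrogate, candidates, target=0.75, top_n=100)

print(f"best base width: {top[0]:.2f} m, "
      f"predicted stress: {toy_surrogate(top[0]):.3f} MPa")
```

Because the ranking relies entirely on the surrogate, the retained candidates still need to be checked by direct finite element simulation.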

The small variation of displacement and stress responses among the top 100 design candidates (Figure 12) shows that all these models are more or less acceptable from an ML-engine point of view. However, they need to be verified by direct finite element simulations. Therefore, a series of new analyses is conducted using the *L*<sub>1</sub> to *L*<sub>7</sub> (and also *L<sub>w</sub>*) values in Figure 12, based on GM1 of RP2500 and the material properties described earlier in this section. The results of the direct finite element simulations are collected and compared to those estimated from inverse analysis based on the trained surrogate model. First, the ratio of the direct finite element results to the machine learning engine results is computed for all top 100 design candidates and all five response parameters. Next, a kernel density function is fitted to each of the five response parameters, as shown in Figure 13. As seen, all four stress responses are centered at one, showing that the results of inverse analysis fluctuate around the direct finite element simulation. The range of significant ratio variation is 0.7–1.3 (and up to 1.5 for compressive stress). This means that despite the similarity of the 100 design candidates from the ML point of view, the direct finite element simulations reveal considerable differences among them. For the displacement response, a bias is observed between the direct finite element and machine learning engine results, as the former yields values about 20% larger on average.
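The comparison described above (ratio of FE to ML-engine results, followed by a kernel density fit) can be sketched as follows. The data are synthetic and only mimic the reported scatter; the Gaussian kernel and the normal-reference bandwidth rule are assumptions, since the text does not state which kernel or bandwidth was used.

```python
import math
import random
import statistics


def gaussian_kde(samples, bandwidth):
    """Return a function evaluating a Gaussian kernel density estimate."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


random.seed(1)
# Synthetic stand-ins for 100 surrogate predictions and FE results (MPa).
ml = [random.uniform(0.7, 0.9) for _ in range(100)]
fe = [m * random.gauss(1.0, 0.1) for m in ml]
ratios = [f / m for f, m in zip(fe, ml)]

# Normal-reference rule of thumb for the bandwidth (an assumed choice).
bandwidth = 1.06 * statistics.stdev(ratios) * len(ratios) ** (-0.2)
kde = gaussian_kde(ratios, bandwidth)

grid = [0.5 + 0.01 * k for k in range(101)]  # ratio axis from 0.5 to 1.5
mode = max(grid, key=kde)
print(f"density peaks near a ratio of {mode:.2f}")
```

A peak near one indicates that the surrogate is unbiased for that response, while the width of the density reflects the scatter between the two sets of results.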

**Figure 12.** Parallel plot for top 100 design cases including eight shape variables (*L*<sub>6</sub> and *L<sub>w</sub>* are constant) and five response parameters. Each line from left to right is a single design. Delta: ∆<sub>max</sub>; Sp1-heel: *σ<sub>p1,max</sub><sup>heel</sup>*; Sp2-heel: *σ<sub>p2,min</sub><sup>heel</sup>*; Sp1-toe: *σ<sub>p1,max</sub><sup>toe</sup>*; Sp2-toe: *σ<sub>p2,min</sub><sup>toe</sup>*.

**Figure 13.** Kernel distribution fitted on the ratio of FE to ML-engine results.

So far, Figure 13 has shown the variation of the top 100 design candidates specified by the ML engine and tested by direct finite element simulation. Since only a single design should be selected in the end, we take a closer look at the top three candidates provided by the ML engine. Six (unknown) shape variables for each design are listed here:


As seen, only the *L*<sub>1</sub> and *L*<sub>7</sub> values change among these top three candidates. Figure 14 illustrates the non-concurrent envelope of the first principal stress for the top three candidates. Figure 14a presents *σ<sub>p1,max</sub>* only for the dynamic response under RP2500 and GM1. As seen, in all cases *σ<sub>p1,max</sub><sup>heel</sup>* is about 0.8 MPa, which is close to the predefined value of 0.75 MPa. The locations with high dynamic tensile stress are the dam heel and (to some extent) the downstream face in the vicinity of the neck discontinuity. Figure 14b shows the same designs; however, the stress results include the static loads too. In this load combination, tensile stresses of about 0.2 MPa are limited to the heel. This is somewhat consistent with the traditional design approach that specifies no/limited tension in the dam. While the initial design is based on the RP2500 scenario, the candidate models are further analyzed for RP5000. Figure 14c,d present the results of the dynamic-only and static + dynamic load combinations for the same three candidates. As seen, the dynamic tensile stresses increase to about 1.2 MPa in all cases, while the combined tensile stress is about 0.7 MPa at the heel (and, for the first candidate, also around the neck). Depending on the design tensile strength *f′<sub>t</sub>*, cracking might be expected at the dam-foundation interface, which necessitates a nonlinear simulation (beyond the scope of this paper).

**Figure 14.** Direct finite element analysis of top three single-constraint surrogate-assisted design candidates (left to right: candidate 1, 2 and 3) based on GM1; non-concurrent envelopes of maximum first principal stresses are shown in MPa. (**a**) RP2500; Dynamic only (*σ<sub>p1,max</sub><sup>heel</sup>* constrained to 0.75). (**b**) RP2500; Static + Dynamic. (**c**) RP5000; Dynamic only. (**d**) RP5000; Static + Dynamic.

#### **7. Summary**

This paper proposed a finite element-based design engine to assist practitioners in feasibility level layout development for 2D concrete gravity dams under seismic events. The engine includes a large inventory of gravity dam shapes with different material properties for concrete and foundation rock. This inventory has been subjected to different water levels and many ground motion records. The current database covers about 24,000 unique combinations of dam shapes, material properties, water levels, and earthquake loading. An automated machine learning (AutoML) tool was used to develop a high-fidelity surrogate model. Next, the surrogate model was combined with inverse analysis to design new dams for which only a few of the design variables are known a priori.

Using auto-sklearn as an instance of AutoML techniques that are increasing in popularity, we showed that one could build accurate surrogate models capable of predicting multiple quantities of interest simultaneously. Such models typically form an ensemble containing a rich combination of various machine learning methods, such as Bayesian and tree-based methods. Moreover, we presented a principled way to assess the performance of surrogate models obtained by AutoML to understand the generalization error. Future research directions for this task include (1) a comprehensive analysis of the impact of the time budget in auto-sklearn on the quality of surrogate models, and (2) a comparison with other AutoML techniques, including AutoKeras that allows us to restrict our focus on neural network models.

On the design side, the surrogate-assisted inverse analysis showed promising results in the early design of concrete gravity dams. The results of the inverse analysis were in very good agreement with test data from the same surrogate model. However, there were some differences for data beyond those initially used for meta-modeling. This necessitates expanding the database to cover even more ground motion records. We tested the design engine for a single scenario (i.e., only based on maximum heel stresses); however, a more refined assessment should be performed in the future to cover multi-output design scenarios. For future studies, the current engine needs to be validated by other high-fidelity simulations, especially for the higher seismic intensity levels. As discussed in the paper, the current seismic design engine is only based on structural analysis results and covers neither the construction complexities nor the failure risk. Those metrics will be integrated with the design engine in the future to make it a robust tool for decision makers.

**Author Contributions:** conceptualization, M.A.H.-A.; methodology, M.A.H.-A. and F.P.-A.; software, M.A.H.-A. and F.P.-A.; validation, M.A.H.-A. and F.P.-A.; formal analysis, M.A.H.-A. and F.P.-A.; writing—original draft preparation, M.A.H.-A. and F.P.-A.; writing—review and editing, M.A.H.-A. and F.P.-A.; visualization, M.A.H.-A. and F.P.-A.; supervision, M.A.H.-A. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Data are available from the corresponding author upon reasonable request.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Characterization of Relative Movements between Blocks Observed in a Concrete Dam and Definition of Thresholds for Novelty Identification Based on Machine Learning Models**

**Juan Mata 1,\* , Fabiana Miranda <sup>2</sup> , António Antunes <sup>1</sup> , Xavier Romão <sup>2</sup> and João Pedro Santos <sup>3</sup>**


**Abstract:** Dam surveillance activities are based on observing the structural behaviour and interpreting the past behaviour supported by the knowledge of the main loads. For day-to-day activities, data-driven models are usually adopted. Most applications consider regression models for the analysis of horizontal displacements recorded in pendulums. Traditional regression models are not commonly applied to the analysis of relative movements between blocks due to the non-linearities related to the simultaneity of hydrostatic and thermal effects. A new application of a multilayer perceptron neural network model is proposed to interpret the relative movements between blocks measured hourly in a concrete dam under exploitation. A new methodology is proposed for threshold definition related to novelty identification, taking into account the evolution of the records over time and the simultaneity of the structural responses measured in the dam under study. The results obtained through the case study showed the ability of the methodology presented in this work to characterize the relative movement between blocks and to identify novelties in the dam behaviour.

**Keywords:** concrete dam; multilayer perceptron neural network model; structural health monitoring; threshold definition; moving average of the residuals; moving standard deviation of the residuals; DBSCAN

#### **1. Introduction**

Structural monitoring involves observing a phenomenon or event and its impact on the structure. The analysis and interpretation of the measurements gathered during an inspection aim to enhance the conceptual understanding of the dam's behaviour and to support the definition of models based on better-estimated parameters. Once a model is built, the dam's condition assessment is based on hypothesis tests and scenario simulations supported by the monitoring data and the prediction of the structural behaviour in space and time. The procedure of providing information for assessing the dam's structural condition is a form of critical analysis capable of reducing the intrinsic uncertainty related to the dam's behaviour.

The assessment of the dam's condition based on the information provided by the monitoring system is possible if the information is kept up to date. Any abnormal behaviour can then be readily identified, allowing for the implementation of appropriate intervention or prevention measures. The assessment of the structural behaviour of the dam and its condition must be performed for each dam independently, even for dams of the same type, since they are influenced by several aspects, such as the heterogeneities in the dam's foundation and the surrounding areas of the dam or the different loads (as a consequence of environmental or operational conditions) that the structure receives. For each concrete dam, different models can be used according to the purpose, the existing knowledge about the actual structural behaviour, and the quality of the information available for the characterization of the structure's behaviour. The selection of the conceptual model to be used for dam surveillance activities must take into account: (i) the purpose of the analysis (safety assessment, prediction of deformations, interpretation of the recorded data from the monitoring system, or analysis of an accident or abnormal behaviour), (ii) the identification of the key factors of the physical problem, and (iii) the available geological and geotechnical information.

**Citation:** Mata, J.; Miranda, F.; Antunes, A.; Romão, X.; Pedro Santos, J. Characterization of Relative Movements between Blocks Observed in a Concrete Dam and Definition of Thresholds for Novelty Identification Based on Machine Learning Models. *Water* **2023**, *15*, 297. https://doi.org/10.3390/w15020297

Academic Editor: Paolo Mignosa

Received: 6 December 2022 Revised: 4 January 2023 Accepted: 9 January 2023 Published: 11 January 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

During a dam's life, the performance and safety conditions are under continuous assessment due to the potential failure scenarios identified during the design phase and due to the other scenarios "suggested" by the observed behaviour through the analysis of relevant parameters (such as water level and temperature variations, among other things), as seen in Figure 1. Typically, these parameters will describe the loads or operating conditions to which the system is subjected, the materials of the structure, the materials of the structure's foundations, and the structural response of the dam [1]. Thus, visual inspections and measurements collected by the structural monitoring system are fundamental for safety control activities. The main physical quantities measured through the monitoring system are seepage and leakage, uplift pressure, horizontal displacements, vertical displacements, relative displacements between blocks (contraction joint movements), and relative displacements in the rock mass foundation. The interpretation of the observed values is usually based on knowledge related to the physical and chemical phenomena that govern the structure and, whenever possible, is based on deterministic or data-based models. For decision-making, deterministic models are preferred, while in day-to-day activities, data-based models are the most commonly used.

**Figure 1.** Some parameters analysed for the assessment of dam safety and performance.

The most common approaches for data-based models are the HST (hydrostatic, seasonal, time) and the HTT (hydrostatic, thermal, time) models, in which the effects of hydrostatic pressure, temperature, and time are considered additive effects, and their separation is valid [1–5]. The hydrostatic pressure is usually considered a polynomial function of the water height. However, in addition to the pressure from the water, dams in cold regions may be exposed to loads from an ice sheet [6]. The time effect is represented by polynomial functions or by functions of another type (exponential or s-shaped). The thermal effect is considered differently in each approach. In the HST approach, the temperature effect is represented by sinusoidal functions with a one-year period, which are a function of the day of the year only. In the HTT approach, the temperature effect is a function of the measured temperatures. A large number of publications about HST and HTT models can be consulted in the literature [1–15].
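To make the additive structure of the HST approach concrete, the following sketch assembles one row of regressors and evaluates a linear prediction. It is an illustration under stated assumptions, not the formulation of any cited reference: the fourth-degree polynomial in water height, the single annual harmonic, the linear time term, and the function names `hst_features` and `hst_predict` are all hypothetical choices.

```python
import math


def hst_features(water_height, day_of_year, years_elapsed, degree=4):
    """One row of HST regressors: intercept, hydrostatic, seasonal and time terms."""
    season = 2.0 * math.pi * day_of_year / 365.25       # one-year period
    hydrostatic = [water_height ** k for k in range(1, degree + 1)]
    seasonal = [math.sin(season), math.cos(season)]     # depends on the day only
    time_effect = [years_elapsed]                       # simplest polynomial in time
    return [1.0] + hydrostatic + seasonal + time_effect


def hst_predict(coefficients, features):
    """Additive model: the response is the sum of the separate effects."""
    return sum(c * f for c, f in zip(coefficients, features))


# Example with a normalised water height (assumed to lie between 0 and 1).
features = hst_features(water_height=0.9, day_of_year=200, years_elapsed=3.5)
print(len(features), "regressors:", [round(f, 3) for f in features])
```

In an HTT variant, the two seasonal columns would be replaced by measured temperatures, and the coefficients would typically be estimated by multiple linear regression.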

Multiple linear regression is one of the most commonly used data-based models [3], showing good performance, mainly for case studies related to the prediction of horizontal and vertical displacements observed on a dam body. On the other hand, new models based on machine learning have been studied and published for about a decade with positive results; these are proposed mainly for the analysis of displacements. Proposed machine learning models could differ in their approach and advantages; such is the case of artificial neural network (NN) models, which principally allow the identification of non-linear relationships between the input quantities, if any. One main aspect to take into account in the case of NN models is that suitable generalization criteria, such as cross-validation criteria, must be adopted to avoid overfitting [5].

There are quantities, such as seepage, uplift pressure, and relative movements between blocks, in which regression models could be used with special care since the separation of effects may not be valid. For example, the behaviour of contraction joints depends on the state of stress installed. The variation observed differs for "open" or "closed" joints. Joint opening and closure are governed by normal stress criteria. The non-linear behaviour presented by the contraction joints can be explained by the non-resistance of the contraction joints to tension. As referred to by Hariri and Kianoush, modelling the joints (contraction, peripheral and lift joints) has an important role in both the static and the seismic analysis of concrete arch dams [16]. Some authors developed a non-linear joint element to represent the behaviour of vertical contraction joints in concrete dams [16–21]. The current knowledge in the field of machine learning allows the proper interpretation and prediction of these quantities, including anomaly detection [22,23], knowledge that is expected to grow in the following years.

The definition of thresholds for novelty identification related to movements in contraction joints is a natural step in the safety control activities of dams. For the identification of damage or gross measurement errors, the limits can be based on deterministic models or even on the maximum values observed in the dam's life history multiplied by a factor. However, the earlier identification of novelties could be based on narrower limits resulting from data-based models. Currently, thresholds are mainly defined by applying a multiplicative factor to the standard deviation of the residuals (the part not explained by the adopted model). This analysis allows the point-to-point identification of values that exceed established limits without any temporal or multidimensional context. Thus, two methodologies for threshold definition are presented in this work: one, complementary to the traditional approach, that can be used for the identification of novelties in a temporal context through the consideration of the moving average and the moving standard deviation of the residuals, and another, in a multidimensional context, through the use of the density-based spatial clustering of applications with noise (DBSCAN) method [24].

The methodology proposed is presented in Section 2. The case study, a concrete dam under exploitation, and the relative movements between blocks registered by the measurements of an automated monitoring system are described in Section 3. The results and final remarks are presented in Sections 4 and 5.

#### **2. Methodology Proposed for the Characterization of Relative Movements between Blocks and for Threshold Definition for Novelty Identification**

#### *2.1. Methodology*

The priority for any threshold definition process is an adequate data-based model with a proper approach. The HTT approach was chosen in this study because of the daily variation observed in the quantity under study (relative movement between blocks). Sufficient information was obtained from the thermometers in the dam body and the water level measurement devices. Regarding the machine learning method, the authors selected the MLP-NN. The methodology for the definition of thresholds based on the residuals of the data-based models developed from the past history of the records (relative movements between blocks in the case study) is detailed below:


The proposed flowchart for the interpretation of the relative movements between blocks and for the operational threshold definition is presented in Figure 2.

**Figure 2.** Proposed flowchart for the definition of thresholds for novelty identification.

The theoretical background related to the MLP-NN algorithm, the moving average, the moving standard deviation of the residuals, and the DBSCAN method are presented in the following subsections.

#### *2.2. Multilayer Perceptron Neural Network Algorithm*

Artificial neural networks have caught the attention of the scientific community since the 1990s [25] due to their ability to learn the pattern of structural behaviour in large infrastructures with a good capacity for generalization [5,14,26–31]. A NN computes a function of the inputs by propagating the computed values from the input neurons to the output neurons using different weights as an intermediate parameter [26]. A multilayer perceptron neural network is a feed-forward algorithm with neurons arranged in layers. It is the most widely used model for cognitive tasks such as pattern recognition and function approximation [5]. The specific architecture of feed-forward networks assumes that all nodes in one layer are connected to those of the next layer. The input layer transmits the data to the output layer going through a set of hidden layers that perform computations that include refining the weights between neurons over many input-output pairs to provide more accurate predictions.

The MLP-NN (Figure 3) learns through an iterative weight adjustment process on the training data so that, in a testing phase, it can predict unseen data. The weights *w* are located on each connection between the input layer *x*, with *N* neurons, and the hidden layer *l*, with *Q* neurons. The first layer receives the inputs, and the last layer produces the outputs. The middle layer is called the hidden layer. The information is fed forward from one layer to the next, with each connection carrying two associated values: the input value and the weight. In mathematics and programming, the weights are stored in a matrix *W*, where the number of columns corresponds to the input dimension (number of input neurons *N*) and the number of rows corresponds to the output dimension (number of hidden layer neurons *Q*). Another parameter associated with the network is the bias *b*. A bias is attached to every layer in the network except the input layer and is intended to account for unforeseen or non-observable factors. Activation functions are applied to each of the neurons in the hidden layer. The activation function maps the linear combination of the inputs and weights to the following layer; in the case presented in Figure 3, from the input layer to the hidden layer and, in a further step, to the output layer *L* of the network.

**Figure 3.** MLP architecture where *x* represents the input layer, *l* represents the hidden layer, *L* represents the output layer, *b* represents the bias and *w* represents the weights, adapted from [5].

For regression problems, such as those found in the framework of SHM, the activation functions must be differentiable so that any non-linear models built from them can be differentiated with respect to their weights [32]. In this case, the activation function from the input layer to the hidden layer, *f*, can be the logistic sigmoid or the hyperbolic tangent function. The activation function from the hidden layer to the output layer, *g*, is a linear function.
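The forward pass of such a network with one hidden layer can be sketched in a few lines. This is a minimal illustration, assuming a hyperbolic tangent for *f* and the identity for *g*; the function name `mlp_forward` and the numerical weights are hypothetical, and no training (weight adjustment) is shown.

```python
import math


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Single-output MLP: tanh hidden layer (f) followed by a linear output (g).

    w_hidden has one row per hidden neuron (Q rows) and one column per input (N).
    """
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Illustrative values only: N = 2 inputs (e.g., scaled water level and temperature),
# Q = 3 hidden neurons, one output (e.g., scaled joint movement).
x = [0.5, -1.0]
w_hidden = [[0.2, -0.4], [0.7, 0.1], [-0.3, 0.5]]
b_hidden = [0.0, 0.1, -0.2]
w_out = [1.0, -0.5, 0.8]
b_out = 0.05

print(mlp_forward(x, w_hidden, b_hidden, w_out, b_out))
```

Because both tanh and the linear output are differentiable, the gradient of a loss with respect to every weight exists, which is what gradient-based training relies on.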

#### *2.3. Time-Window Threshold Definition Based on the Moving Average and Moving Standard Deviation of the Residuals*

One of the main objectives of the residual analysis is to detect any deviation between the observed and predicted data that is not captured by the prediction model. A moving average control chart of the residuals (*m*.*a*.*r*.) is a type of memory control chart based on an unweighted moving average, and it is defined as

$$m.a.r._i = \frac{m.sum.r._i}{w} = \frac{\sum_{j=i-w+1}^{i} r_j}{w}, \quad i \ge w \tag{1}$$

where *r<sub>j</sub>* is the residual at time *j* and *w* is the width of the moving window ending at time *i*.

To describe the dispersion of the residuals, i.e., how much they vary or how spread out they are over time, the moving standard deviation of the residuals (*m*.*sd*.*r*.) is also used, defined as

$$m.sd.r._i = \sqrt{\frac{1}{w-1} \sum_{j=i-w+1}^{i} \left(r_j - m.a.r._i\right)^2}, \quad i \ge w \tag{2}$$

The moving average is thus the average of all residuals within a time window of size *w* ending at time *i*. Four different windows, of 6, 12, 24, and 168 records, corresponding to a quarter of a day, half a day, an entire day, and a week, are potential options for this type of study. However, only the results of the *m*.*a*.*r*. and *m*.*sd*.*r*. with a time window of one day and a time step of one hour are presented in this work.

Once the *m*.*a*.*r*. and the *m*.*sd*.*r*. are established, the definition of a baseline of the structural behaviour under normal conditions is possible. The *m*.*a*.*r*., in this sense, allows the identification of extreme values and trends along a time period. The *m*.*sd*.*r*. allows the identification of any increment of variability and/or randomness along the same time period. Those changes might suggest novelty.
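A minimal sketch of Equations (1) and (2) is given below, assuming the residuals are available as a plain list ordered in time. The function name `moving_stats` and the sample values are hypothetical; the study itself reports using R, so this Python version is only illustrative.

```python
import math


def moving_stats(residuals, w):
    """Moving average and moving standard deviation over windows of w records.

    The first value corresponds to the window ending at the w-th record.
    """
    mar, msdr = [], []
    for i in range(w - 1, len(residuals)):
        window = residuals[i - w + 1 : i + 1]
        mean = sum(window) / w
        variance = sum((r - mean) ** 2 for r in window) / (w - 1)
        mar.append(mean)
        msdr.append(math.sqrt(variance))
    return mar, msdr


# Illustrative residuals (mm) with a shift in the last records; w = 3 for brevity.
residuals = [0.00, 0.02, -0.01, 0.01, 0.00, 0.15, 0.18, 0.20]
mar, msdr = moving_stats(residuals, w=3)
for k, (m, s) in enumerate(zip(mar, msdr), start=3):
    print(f"record {k}: m.a.r. = {m:+.3f}, m.sd.r. = {s:.3f}")
```

For hourly data and a one-day window, `w` would be 24. A sustained drift shows up in the moving average, while a jump or an increase in scatter shows up in the moving standard deviation.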

#### *2.4. Multivariate Threshold Definition Based on DBSCAN Algorithm*

Density-based spatial clustering of applications with noise (DBSCAN) is a clustering algorithm proposed by Ester et al. [24]. It searches for "core objects", i.e., points that contain a minimum number of observations (MinPoints) within their neighbourhood (defined by a radius epsilon), including the core point itself. If a point lies outside the neighbourhood of every core object, it is considered noise [33]. Border points are points within reach of a core object that do not have the minimum number of points in their own neighbourhood to be considered core objects themselves (Figure 4). DBSCAN is known to be able to discover clusters of arbitrary shape and to be robust to outliers and noise.

**Figure 4.** DBSCAN illustration with MinPoints = 4 and the epsilon distance represented by circles. Core points, such as A, are represented in red; B and C, in yellow, are border points; and N, in blue, is a noise point. From [34].

The clusters are created as follows: after uncovering a core object (i.e., a point with a high density of neighbours according to the parameters), DBSCAN starts a cluster with it and all its neighbours. The neighbours of every neighbour that is itself a core object are then added to the cluster. This process continues until there are no more points within reach. All the points that do not belong to a neighbourhood (i.e., to a cluster) are considered noise or outliers.
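The cluster-growing procedure can be sketched with a compact, unoptimised implementation. This is a didactic version written for this illustration, not the code used in the study: it uses a brute-force neighbour search and Euclidean distance, and the sample points are invented.

```python
import math

NOISE = -1


def region_query(points, idx, eps):
    """Indices of all points within eps of points[idx] (the point itself included)."""
    return [j for j, q in enumerate(points) if math.dist(points[idx], q) <= eps]


def dbscan(points, eps, min_points):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_points:
            labels[i] = NOISE            # may become a border point later
            continue
        cluster += 1                     # i is a core object: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster      # border point reached from a core object
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_points:
                queue.extend(j_neighbours)   # j is also a core object: keep growing
    return labels


points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_points=3)
print(labels)
```

In this toy data set the two groups of four points form two clusters and the isolated point at (5, 5) is labelled as noise, which is the role a novelty would play in a multivariate residual space.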

#### **3. Case Study**

#### *3.1. The Feiticeiro Dam*

The hydroelectric development of Baixo Sabor is composed of two schemes. The upstream scheme (termed the Baixo Sabor dam) is 12.6 km from the mouth of the Sabor river. The downstream scheme (termed the Feiticeiro dam, Figure 5) is about 3.3 km from the mouth of the Sabor river [35]. The Feiticeiro dam comprises a concrete gravity dam with an overflow-controlled spillway and a downstream stilling basin. The dam is 45 m high and has a crest length of 315 m, which is divided into twenty-two blocks. The total concrete volume is equal to 130,000 m<sup>3</sup>.

**Figure 5.** Overall view of the Feiticeiro dam.

In accordance with the best technical practices, the monitoring system of the Feiticeiro dam aims at the evaluation of the loads, the characterisation of the rheological, thermal and hydraulic properties of the materials, and the evaluation of the dam's structural response. The monitoring system of the Feiticeiro dam consists of several devices that make it possible to measure quantities such as the concrete and air temperatures, reservoir water level, seepage and leakage, displacements in the dam and in its foundation, joint movements, strains and stresses in the concrete, and pressures, among others [36–38].

The system used for the measurement of the reservoir water level comprised a high-precision pressure meter, which provides a record of the water height over time, and a level scale. The air temperature and humidity were measured in an automated weather station placed on the right bank, approximately 100 m from the dam crest.

The concrete temperature was measured by 76 electrical resistance thermometers distributed across the thickness of several blocks. The location of the thermometers was defined taking into account the set of other electrical resistance devices (strain gauges and embedded jointmeters) that also allowed for the measurement of the concrete's temperature.

Displacements were measured using an integrated system that included 3 pendulums, 8 rod extensometers, and geodetic observations. The relative movements between blocks were measured by superficial and embedded jointmeters.

The deformation of the concrete was measured with electrical strain gauges arranged in groups and distributed in radial sections, allowing the determination of the stress state through the knowledge of the deformation state and of the deformation law of the concrete. The quantities of drained and infiltrated water were measured individually in drains of the drainage system installed in the dam foundation and in weirs that differentiated the total quantity of water that flows in the drainage gallery in several zones of the dam. The drainage system comprised a set of 57 drains distributed over the drainage gallery with 3 drains per block. All the water extracted from drains and leakages was collected in 4 weirs.

The measurement of the uplift pressure at the foundation was performed by a piezometric network that comprised 26 piezometers. A real-time data acquisition system for the auscultation instrumentation (ADAS) was installed at Feiticeiro dam, allowing the measurement of the following physical quantities [39]:


The responses under study are the variations of the relative movements between blocks (at the higher level) measured by 7 automated jointmeters. The reservoir water level and the concrete temperature, the latter measured by several thermometers located in the dam body, represent the main environmental loads studied in this case.

The mathematical calculations and the graphic representations presented ahead were supported by the R project software [40–43].

#### *3.2. The Analysed Data*

In this case study, the daily variation of the opening-closing movements between blocks, measured at the 134.2 m level by 3D jointmeters (designated BT3, BT5, BT7, BT8, BT10, BT12, and BT13), was analysed.

The location of the jointmeters that are part of the automated monitoring system is shown in Figure 6. The data analysed correspond to a period between April 2017 and June 2022, with more than 38,660 records per variable. Measurements from the automated monitoring system are taken every hour. Where there were gaps in the records, the values were estimated by interpolation between consecutive records. The manual and the automated measurements of the opening-closing movements were compared, showing the good performance of the two measurement systems (this comparison is not part of this study).
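The gap-filling step mentioned above can be illustrated with a minimal sketch. The text does not state which interpolation scheme was used, so linear interpolation is an assumption here, and the function name `fill_gaps` and the sample values are hypothetical.

```python
def fill_gaps(values):
    """Linearly interpolate None entries between the nearest valid neighbours.

    Gaps at the very start or end of the series are left as None because
    there is no record on one side to interpolate from.
    """
    filled = list(values)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is None:
            start = i - 1                      # last valid index before the gap
            end = i
            while end < n and filled[end] is None:
                end += 1                       # first valid index after the gap
            if start >= 0 and end < n:
                step = (filled[end] - filled[start]) / (end - start)
                for j in range(i, end):
                    filled[j] = filled[start] + step * (j - start)
            i = end
        else:
            i += 1
    return filled


# Hypothetical hourly joint movements (mm) with a two-hour gap.
hourly = [0.10, None, None, 0.40, 0.38]
print(fill_gaps(hourly))
```

Leaving leading and trailing gaps untouched is a deliberate choice in this sketch, since extrapolating beyond the recorded values would add information the monitoring system did not provide.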

The samples regarding the relative movements between blocks were collected every two weeks for manual measurements and every hour for automated measurements, as seen in Figure 7. Signs (+) indicate opening movements, and signs (−) indicate closing movements.

**Figure 6.** Feiticeiro dam. Thermometers and 3D jointmeters included in the automated monitoring system. Legend: MWL—Maximum reservoir water level.

**Figure 7.** Time series of the relative movements between blocks measured between April 2017 and June 2022.

Since the measuring devices studied in this work were installed on the downstream face of the dam body (there is no visiting gallery at the level of 134 m), they measured the relative movements between blocks in a zone that is strongly influenced by the daily temperature variations, as shown in Figure 8 for the measurements between the 1st and the 8th of August 2018.

**Figure 8.** Time series of the relative movements between blocks measured between the 1st and the 8th of August 2018.

Among the different loads acting on concrete dams, it is typical to distinguish, as the most important loads for structures in normal operations, the hydrostatic pressure and the temperature variation. The time evolution of the reservoir water level is presented in Figure 9.

**Figure 9.** Time series of the reservoir water level measurements between April 2017 and June 2022.

The temperature variations observed in the dam body were recorded through thermometers embedded in the dam body (blocks J7–J8, J11–J12, and J15–J16). The relative positions of the thermometers are presented in Figure 6 and Table 1, and their records are shown in Figure 10.


**Table 1.** Relative location of the thermometers in the dam body.

Notes: u.f.—upstream face, d.f.—downstream face, oth—other, th—thickness.

**Figure 10.** Temperatures in the dam body recorded between April 2017 and June 2022.

The observations presented in Figures 7–10 will be used for the computation of the models presented in this work.

Based on the evolution of the time series of the joint movements, water level, and temperature variations, it is expected that the thermal effect is more significant than the water level effect. Figure 11 presents an example of the effect of the temperature and the water height on the structural response for the case study. Figure 11 (left) shows the evolution of the opening-closing movement measured in the jointmeter BT3 vs. the temperature measured in the thermometer T37. As expected, the movement between the blocks is in the closing direction when the temperature increases. Figure 11 (right) shows the evolution of the opening-closing movement measured in the jointmeter BT3 and the water height when the temperature measured in the thermometer T37 is 10 °C. The opening movement increases slightly with an increase in the water height.

**Figure 11.** Opening-closing mov. measured in BT3 vs. temperature measured in T37 (**left**). Opening-closing mov. measured in BT3 vs. water level for temperature measured in T37 equal to 10 °C (**right**).

#### **4. Results and Discussion**

#### *4.1. Model Formulation, Construction, and Performance*

The MLP-NN model based on an HTT approach considered, as an input layer, 15 parameters (three representing the hydrostatic pressure, *h*, *h*<sup>2</sup> and *h*<sup>4</sup>, where *h* is the reservoir water level that can vary between 0 and 45 m, and twelve representing the temperature effects measured at T10, T11, T15, T16, T31, T32, T33, T35, T62, T64, T65 and T66), seven responses at the output layer (representing the opening-closing movements between blocks measured at BT3, BT5, BT7, BT8, BT10, BT12 and BT13), and one hidden layer. Every neuron in the network is fully connected.
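As an illustration of how the 15 inputs described above could be assembled for one record, the following sketch builds the vector from a water level and twelve thermometer readings. The function name `htt_inputs` and the numerical values are hypothetical, and any scaling of the inputs before training is left out because the text does not describe it.

```python
# Thermometers named in the text as temperature inputs.
THERMOMETERS = ["T10", "T11", "T15", "T16", "T31", "T32",
                "T33", "T35", "T62", "T64", "T65", "T66"]


def htt_inputs(h, temperatures):
    """Return [h, h^2, h^4, T_1, ..., T_n] for one record."""
    return [h, h ** 2, h ** 4] + list(temperatures)


# Placeholder record: water level of 30 m and 12 degC at every thermometer.
reading = {name: 12.0 for name in THERMOMETERS}
x = htt_inputs(30.0, [reading[name] for name in THERMOMETERS])
print(len(x))  # 15
```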

A hyperbolic tangent transfer function was chosen as the activation function for the hidden layer, and a linear activation function was chosen for the output layer. The generalized backpropagation delta learning rule algorithm was used in the training process. To find the optimum result (through the minimization of a cost function defined by the mean squared error), 5 initializations of random weights and a maximum of 1500 iterations were performed for each MLP-NN architecture.
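The layer arrangement described above (hyperbolic tangent hidden layer, linear output layer, mean squared error cost) can be sketched as a plain forward pass. This is not the authors' R implementation: the helper names are hypothetical, the weights are random and untrained, the input values are placeholders, and the hidden layer size of 25 is taken from the architecture the text later reports as performing best.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """One weight row per neuron: n_in weights followed by a bias term."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(n_out)]


def forward(x, hidden_layer, output_layer):
    """Hyperbolic tangent hidden layer followed by a linear output layer."""
    hidden = [math.tanh(w[-1] + sum(wi * xi for wi, xi in zip(w, x)))
              for w in hidden_layer]
    return [w[-1] + sum(wi * hi for wi, hi in zip(w, hidden))
            for w in output_layer]


def mean_squared_error(predicted, observed):
    """Cost function minimised during training."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


rng = random.Random(1)                   # one of several random initialisations
hidden_layer = init_layer(15, 25, rng)   # 15 inputs -> 25 hidden neurons
output_layer = init_layer(25, 7, rng)    # 25 hidden neurons -> 7 jointmeters
x = [0.1] * 15                           # placeholder (scaled) input record
y_hat = forward(x, hidden_layer, output_layer)
print(len(y_hat))  # 7
```

Repeating the initialisation with different seeds and keeping the network with the lowest validation error would mirror the multiple random initialisations mentioned in the text.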

A randomization of the learning set was previously carried out, making it possible to define the training set, the cross-validation set and the test set, with a number of examples equal to 65%, 15%, and 20%, respectively. The cross-validation set was used as the stopping criterion. In each iteration, the performance for the training set is usually better than before, but if at any time the error for the cross-validation set increases, the NN model may lose its generalization capacity. The training stops when the error for the cross-validation set begins to increase, with a better generalization thus being ensured [5]. The test set was used as an auxiliary element that enabled us to evaluate the quality of the MLP-NN model obtained from the training set. In this case study, the network with the best performance was a 15-25-7 MLP-NN (lowest error for the cross-validation set). The results are represented in Figures 12–14 and described in Table 2.
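The data partition and the stopping rule described above can be sketched as follows. The function names, the random seed, and the `patience` parameter are assumptions introduced for illustration; with `patience=1` training stops at the first increase of the cross-validation error, which is the behaviour the text describes.

```python
import random


def split_indices(n, seed=0, fractions=(0.65, 0.15, 0.20)):
    """Shuffle record indices and split them into train, validation and test."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(round(fractions[0] * n))
    n_val = int(round(fractions[1] * n))
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def early_stopping_epoch(val_errors, patience=1):
    """Return the epoch whose weights would be kept under early stopping."""
    best, best_epoch, waited = float("inf"), 0, 0
    for epoch, err in enumerate(val_errors):
        if err < best:
            best, best_epoch, waited = err, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break            # validation error started to increase
    return best_epoch


train, val, test = split_indices(1000)
print(len(train), len(val), len(test))

# Hypothetical validation errors per iteration.
print(early_stopping_epoch([0.9, 0.5, 0.4, 0.45, 0.3]))
```

Note that the last value in the example (0.3) is never reached: the rule stops at the first increase, so a noisy validation curve may end training early, which is one reason several random initialisations are useful.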


**Table 2.** MLP model performance parameters.

Notes: *r*—residual, *sd*—standard deviation, *R* <sup>2</sup>—coefficient of determination.

**Figure 12.** Time series of the fitted values and recorded values for the movements measured through 3D jointmeters.

**Figure 13.** Time series of the measured relative movements between blocks and the fitted values between the 1st and the 8th of August 2018.

#### *4.2. Threshold Definition for a Singular Record*

The traditional approach adopted for threshold definition in regression models consists of adopting a multiplicative factor associated with the standard deviation of the residuals. Usually, multiplicative factors equal to 3 or 4, corresponding to a confidence level of 99.73% and 99.99%, respectively, are adopted. This approach assumes that the residuals follow a normal distribution (an assumption that is often not verified for real cases but that allows a good approximation). The same type of criterion can also be used in the MLP-NN model following the limits shown in Figure 15.

**Figure 15.** Residuals of the movements measured through 3D jointmeters.
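The single-record criterion can be sketched in a few lines: the confidence level associated with a multiplier follows from the normal distribution, and a new residual is flagged when its absolute value exceeds the multiplier times the standard deviation of the reference residuals. The function names and the residual values are hypothetical, and the threshold is centred on zero, which assumes residuals with a zero mean.

```python
import math
import statistics


def coverage(k):
    """Probability that a standard normal variable lies within +/- k."""
    return math.erf(k / math.sqrt(2))


def flag_records(new_residuals, reference_residuals, k=3.0):
    """Indices of new residuals beyond k standard deviations of the reference."""
    sd = statistics.pstdev(reference_residuals)
    return [i for i, r in enumerate(new_residuals) if abs(r) > k * sd]


print(round(100 * coverage(3), 2))   # 99.73
print(round(100 * coverage(4), 2))   # 99.99

# Hypothetical residuals (mm): reference period and three new records.
reference = [0.1, -0.1, 0.1, -0.1]
new = [0.05, 0.35, -0.45]
print(flag_records(new, reference, k=3.0))   # [1, 2]
print(flag_records(new, reference, k=4.0))   # [2]
```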

#### *4.3. Threshold Definition for Novelty Identification Based on a Time Period Evolution of the Residuals*

When a novelty is identified, it is relevant to verify whether it is an isolated record in order to avoid false internal warnings since, usually, isolated novelties are associated with measurement errors. A critical aspect of dam safety control activities is the pattern recognition in the observed behaviour, including its expected evolution over time. With the exception of extreme load events and other time effects (such as the existence of internal expansion reactions in the concrete), the pattern in the observed behaviour is expected to be continuous over time. Thus, one way to identify novelties in the dam's behaviour is to analyze the evolution of this behaviour (through the residuals) within a time window that moves along the record. Since daily variations resulting from daily temperature changes are observed, a moving window with 24 hourly records and with a time step equal to the measurement frequency (1 h) was adopted.

Figure 16 shows the evolution of the moving average of the residuals (*m*.*a*.*r*.) and of the moving standard deviation of the residuals (*m*.*sd*.*r*.) in a time window of one day over time.
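A minimal sketch of the two moving statistics, assuming a window of 24 hourly residuals advanced one record at a time as described above. The function name `moving_stats` and the synthetic residuals are hypothetical, and the population standard deviation is used here because the text does not specify which estimator was adopted.

```python
import statistics


def moving_stats(residuals, window=24):
    """Moving average (m.a.r.) and moving standard deviation (m.sd.r.)."""
    mar, msdr = [], []
    for end in range(window, len(residuals) + 1):
        chunk = residuals[end - window:end]     # the last `window` records
        mar.append(statistics.mean(chunk))
        msdr.append(statistics.pstdev(chunk))
    return mar, msdr


# Two days of synthetic hourly residuals: a calm day, then a shifted day.
residuals = [0.0] * 24 + [0.2] * 24
mar, msdr = moving_stats(residuals)
print(len(mar))            # 25 windows for 48 records
print(mar[0], mar[-1])     # the moving average drifts from 0.0 towards 0.2
```

A sustained drift in the moving average, as in the second day of this example, is the kind of pattern that the thresholds on *m*.*a*.*r*. are meant to capture, whereas an isolated spike mainly raises the moving standard deviation.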

**Figure 16.** Residuals, *m*.*a*.*r*., and *m*.*sd*.*r*. over time.

The simultaneous analysis of the *m*.*a*.*r*. and *m*.*sd*.*r*. allows us to identify whether the predicted values are moving away from the recorded values (i.e., whether the residuals are increasing, meaning that the non-explained part of the model is increasing). If the residuals increase over time, this could suggest a damage evolution. The definition of thresholds based on multiplicative factors associated with the standard deviation of the residuals is also proposed in this analysis. Similar to the single record analysis, thresholds based on multipliers of 3 and 4 for the standard deviation of the residuals are proposed, although these values can be updated depending on the results, as seen in Figure 17. The priority in this kind of analysis is to identify periods in which the residuals and their evolution are higher, so that a deeper analysis of the data can be carried out.

**Figure 17.** *m*.*sd*.*r*. vs. *m*.*a*.*r*. for each quantity.

#### *4.4. Threshold Definition for Novelty Identification Based on Multivariate Data*

Both of the previous analyses consider one physical quantity at a time. However, considering several quantities at the same time is relevant for an integral judgment. To address this consideration, a threshold definition based on the DBSCAN method is used. By definition, the residuals should follow a distribution similar to a normal distribution with a zero mean. Based on this premise, a criterion is defined for the identification of spread records (in the *n* features dimension, seven in this case study) that are far from most of the records. An analysis of the effect of the parameters Epsilon and MinPoints was performed, as presented in Table 3. Values of Epsilon = 0.12 and MinPoints = 7 were adopted for this case study, as seen in Figure 18. These values were defined based on specialist judgment, considering the standard deviation of the residuals.
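The way DBSCAN separates potential novelties from the bulk of the records can be sketched without any library: a record is a core point if at least MinPoints records (itself included, which is a convention assumed here) lie within Epsilon of it, and a record is labelled noise if it is neither a core point nor within Epsilon of one. The function name `dbscan_noise` and the two-dimensional sample points are hypothetical; the case study works with seven-dimensional residual vectors, and this brute-force version is only practical for small samples.

```python
import math


def dbscan_noise(points, eps, min_points):
    """Return the indices that DBSCAN would label as noise."""
    n = len(points)
    # Neighbourhood of each point, including the point itself.
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    core = {i for i in range(n) if len(neighbours[i]) >= min_points}
    # Noise: not a core point and not reachable from any core point.
    return [i for i in range(n)
            if i not in core and not any(j in core for j in neighbours[i])]


# Eight residual vectors close to zero and one record far from the rest.
points = [(0.01 * i, 0.0) for i in range(8)] + [(1.0, 1.0)]
print(dbscan_noise(points, eps=0.12, min_points=7))  # [8]
```

Increasing Epsilon or lowering MinPoints makes the criterion more permissive, so fewer records are reported as potential novelties, which is the kind of sensitivity explored in Table 3.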

**Table 3.** Number of potential novelties based on several Epsilon and MinPoints values.


**Figure 18.** Novelty identification based on a multivariate analysis based on the DBSCAN method (adopting Epsilon = 0.12 and MinPoints = 7).

The dam behaviour in the time period under study presents a good performance, and there is no abnormal behaviour. The monitoring system records were used for defining the thresholds based on the three proposed procedures. Based on the thresholds defined and for future measurements, the records above the thresholds are the potential candidates to represent a novelty. They should be assessed in order to identify whether they are related to a measurement error or due to other situations.

The analysis of the monitoring data gathered from the horizontal displacements in a concrete dam's behaviour and the adoption of data-based models to interpret the structural behaviour are common in day-to-day dam surveillance activities.

#### **5. Conclusions and Final Remarks**

The analysis of relative movements between blocks through data-based models is unusual due to the nonlinearity of the observed data, which cannot be represented in traditional linear regression models. In this case, a multilayer perceptron neural network (MLP-NN) model was developed to interpret the observed relative movements between blocks. The residuals (the part not explained by the model) resulting from the model were used in this work to define thresholds for novelty identification. The explanatory capacity of the residual analysis allows the definition of a more accurate baseline for the characterization of the structural behaviour. In this work, the analysis of the relative movements between blocks was performed based on hourly recorded measurements. The main inputs considered are functions of the water level and the temperatures measured in several thermometers spread along the dam body. The proposed procedures allow for the earlier detection of novelties through the analysis of the residuals of the MLP-NN prediction model, adopted for the interpretation of the relative movement between blocks, for (i) a singular record, (ii) a moving time period, and (iii) multivariate records. The main issues related to the definition of thresholds can be summarised as follows:


To be effective, dam safety control activities must be considered an ongoing process. In this sense, applying the proposed procedures for operational issues in different dams and considering different quantities is suggested. It is important to highlight that each procedure has its singularities and challenges; the observed structural behaviour, the performance of the model adopted, and the quality and frequency of the measurement recorded need to be considered for the adoption of an adequate multiplicative factor for univariate thresholds and the adoption of adequate parameters for multivariate thresholds. The definition of the moving window size for calculating the moving average and the standard deviation depends on the measurement frequency of the records and the reservoir's exploitation regime. Despite these challenges, the implementation of any of the proposed procedures in an automated monitoring system can adequately support dam surveillance activities.

**Author Contributions:** Conceptualization, J.M. and F.M.; methodology, J.M. and F.M.; software, J.M., F.M. and A.A.; validation, A.A., X.R. and J.P.S.; formal analysis, J.M., F.M., A.A., X.R. and J.P.S.; investigation, J.M., F.M. and A.A.; data curation, J.M. and A.A.; writing—original draft preparation, J.M. and F.M.; writing—review and editing, J.M., F.M., A.A., X.R. and J.P.S.; visualization, J.M. and F.M.; supervision, J.M., X.R. and J.P.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** The contribution of the second author was funded by the Portuguese Foundation for the Science and Technology (FCT) through the grant number PD/BD/150407/2019.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Restrictions apply to the availability of these data. Data were obtained from Movhera—Hidroelétricas do Norte, S.A and Engie—Hidroelectricas do Douro, Lda, and are available from the authors with the permission of both of these entities.

**Acknowledgments:** The authors acknowledge Movhera—Hidroelétricas do Norte, S.A and Engie—Hidroelectricas do Douro, Lda, which provided the data for the procedures addressed in this paper, as well as LNEC through its research program RESTATE (0403/112/20970). The second author acknowledges the financial support of the Portuguese Foundation for the Science and Technology (FCT) through the grant PD/BD/150407/2019. The fourth author also acknowledges the financial support of the Base Funding—UIDB/04708/2020 of CONSTRUCT—Instituto de I&D em Estruturas e Construções, funded by national funds through FCT/MCTES (PIDDAC).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

## *Article* **Risk-Informed Design of RCC Dams under Extreme Seismic Loading**

**Keith A. Ferguson**

Senior Technical Advisor for Dams and Hydraulic Structures, HDR Engineering, Inc., Suite 3400, Denver, CO 80220-4824, USA; keith.ferguson@hdrinc.com

**Abstract:** The existing Scoggins Dam and reservoir are in Washington County, Oregon, and the title is held by the U.S. Bureau of Reclamation (Reclamation). Reclamation has previously identified dam safety concerns related to the existing embankment dam. Regional project sponsors, including Clean Water Services, have identified the need for expanded storage capacity in the reservoir to meet growing water demands and address water quality issues in the Tualatin River downstream of the dam. As part of efforts to resolve dam safety issues and increase the water storage in the reservoir, a comprehensive feasibility level design of a new 185-foot-high Roller Compacted Concrete (RCC) dam has been completed. Extraordinary seismic hazards have been identified in the region associated with the Cascadia Subduction Zone (CSZ). Further, any dam alternative carried forward for funding, final design, and construction will have to meet the Public Protection Guidelines (PPG) of Reclamation that require a formal quantitative risk analysis. A risk-informed design approach was adopted to configure the layout and cross-section properties of the dam. A multi-phase site characterization program and preliminary RCC mix design program were performed to support the design. In addition, models were developed, and an extensive suite of both two-dimensional (2D) and three-dimensional (3D) structural analyses was performed for seismic loadings with total durations of over 200 s, strong shaking of over 140 s, and peak ground accelerations of over 2 gravitational accelerations (g) (up to 50,000-year return period event). This paper describes the feasibility design configuration of the dam, including the seismic hazard characterization, structural analysis models, and seismic response modeling results. The expected performance of the dam relative to the risk-informed design criteria and Reclamation PPGs is described in general terms.

**Keywords:** roller compacted concrete (RCC); risk-informed design; Cascadia Subduction Zone (CSZ); non-linear structural analysis

#### **1. Introduction**

Hagg Lake reservoir, impounded by Scoggins Dam (Figure 1), is a key water resources facility for a range of water providers in Washington County, Oregon. The dam was completed in 1975. It is located on Scoggins Creek, about 5 miles (8 km) southwest of Forest Grove, Oregon and 25 miles (40.25 km) west of Portland, Oregon. The estimated existing storage capacity of Hagg Lake is 53,640 acre-feet (>66,000,000 cubic meters [m<sup>3</sup>]) at the top of the current active conservation pool elevation of 303.5 feet (92.5 m). An enlarged reservoir has been under consideration as a central feature of the Tualatin Basin Dam Safety and Water Supply Joint Project. The new RCC dam holding the enlarged Joint Project reservoir presented in this paper would be located downstream of the existing embankment dam and have a maximum structural height of 180.5 feet (55 m), an increased storage capacity of up to 50,000 acre-feet (61,674,000 m<sup>3</sup>) and a total maximum storage of 103,640 acre-feet (127,838,000 m<sup>3</sup>).

**Citation:** Ferguson, K.A. Risk-Informed Design of RCC Dams under Extreme Seismic Loading. *Water* **2023**, *15*, 116. https://doi.org/10.3390/w15010116

Academic Editors: M. Amin Hariri-Ardebili, Fernando Salazar, Farhad Pourkamali-Anaraki, Guido Mazzà and Juan Mata

Received: 27 October 2022 Revised: 19 December 2022 Accepted: 21 December 2022 Published: 29 December 2022

**Copyright:** © 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

**Figure 1.** Project location map.

The feasibility design described in this paper was based on the results of appraisal-level designs (Reclamation, FAC 09-01) [1], where the RCC dam concept was originally examined as a possible project that could address both dam safety and increased water supply project objectives. A straight gravity plan and cross-section were developed, structurally modeled, and then subjected to a full quantitative risk analysis (QRA). The QRA demonstrated that the gravity dam section could be designed to withstand the very large seismic hazards at the site and that some optimization would likely be possible. One of the most significant optimizations identified during the QRA was curving the dam in plan to activate arch action during higher seismic events that could cause cracking and potential sliding of the dam. The corresponding cross-section and curved gravity alignment were verified during feasibility design based on a risk-informed design approach.

#### **2. A Review of Seismic Design and Dam Safety Risk Analysis of Concrete Dams**

Seismic design and analysis of existing or new concrete dams is a challenging task and requires consideration of multiple performance factors, including dam-foundation-water interaction, proper seismic analysis, and realistic modeling of nonlinear and damage behavior of concrete [2]. In the case of new concrete dams, consideration of material property variation with time and thermal effects must be incorporated into the design. While there has been tremendous research on the seismic "analysis" of existing concrete dams and hydraulic structures against natural hazards, there has been limited attention on the seismic "design" of new dams [3], particularly in high seismic regions and for dams for which failure may impact large populations downstream of the reservoir.

Traditionally, concrete gravity dams have been designed by a force-based procedure [4,5]. In this method, the inertia forces associated with the weight of the dam are obtained as the product of a seismic coefficient and the weight of the portion of the dam being considered, and they are statically applied to the dam. These forces are also combined with hydrostatic and simplified hydrodynamic pressures [6]. The main disadvantage of this method is that the dynamic characteristics of the dam and the applied ground motions are ignored.

Therefore, for new concrete dam design, a state-of-the-art method is needed to overcome the limitations of the traditional standards-based seismic design methods. This paper proposes the application of risk-informed design in concrete dam engineering. In the following sub-sections, a brief review of the literature is provided on (1) different design philosophies in earthquake engineering and (2) the evolution and application of risk-based procedures in dam engineering.
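The force-based procedure discussed above can be written compactly. The notation below is an assumption introduced for illustration (the source does not define symbols): $W_i$ is the weight of the portion $i$ of the dam, $k_h$ is the horizontal seismic coefficient, and $F_{h,i}$ is the equivalent static force applied to that portion.

$$
F_{h,i} = k_h \, W_i
$$

For a hypothetical portion weighing $W_i = 10{,}000$ kN and a hypothetical coefficient $k_h = 0.1$, the equivalent static force is $F_{h,i} = 0.1 \times 10{,}000 = 1{,}000$ kN. Because $k_h$ is a single constant, the result does not change with the natural periods of the dam or with the duration and frequency content of the ground motion, which is the limitation noted in the text.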

#### *2.1. Seismic Design Philosophies*

A key factor in the design process for modifying an existing, or constructing a new, concrete dam or major concrete hydraulic structure is the identification of design objectives and criteria. Over the past 30 to 40 years, there has been significant advancement of analysis capabilities, the widely accepted use of RCC for both rehabilitation of existing dams and design of new dams, and continuing development of different design methods utilizing advanced numerical modeling and updated codes, standards, and industry guidelines. Performance-based seismic design [7] allows for the desired level of seismic performance for the structural systems when they are subjected to a specific level of ground shaking. Multiple levels of ground shaking can be evaluated, with a different level of performance specified for each shaking level. Reliability-based seismic design [8] uses the mean values of the random system parameters as design variables and optimizes the objective function subject to pre-defined probabilistic constraints. Risk-based seismic design [9] uses safety as the main objective. A risk-based design considers composition, intended use, materials, actual use, environmental issues, and ultimate decommissioning of the structure. It can be complex, requiring knowledgeable teams and schedules that include risk analysis throughout the design process. More recently, the concept of resilience-based seismic design [10] has been introduced. It is the next generation of performance-based design procedures. In this approach, the interaction of the structure with local and regional communities is accounted for. The structure should not be considered alone but as part of a group of infrastructure, which requires a portfolio approach that incorporates regional loss analysis.

The implementation of the above-discussed design philosophies in concrete dam engineering is usually discussed within the so-called "shape optimization" task by Ramakrishnan and Francavilla [11]. It is an iterative procedure that combines advanced structural optimization theories with specific objective functions that are defined by engineers for the gravity or arch dams [12,13]. In some cases, the optimization algorithms are combined with machine learning methods to accelerate the design process and reduce the computational burden [14,15].

#### *2.2. Risk-Informed Decision-Making (RIDM) in Dam Engineering*

Risk can be defined as the "measure of the probability and severity of an adverse effect to life, health, property, or environment". In the general case, risk is estimated by the combined consideration of loading scenario(s), the system (dam or structure) response to loading, and the associated consequence of failure [16]. The risk concept can be interpreted either as an individual risk or the overall risk of the system with different dimensions. Risk management encompasses activities related to making risk-informed decisions, prioritizing evaluations of risk, prioritizing risk reduction activities, and making program decisions associated with managing a portfolio of facilities.
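The combination of loading, system response, and consequences described above can be sketched as a simple summation over load scenarios. The dictionary keys, the function name, and all numerical values are hypothetical and are only meant to show the arithmetic; an actual quantitative risk analysis uses event trees with many more branches and explicit treatment of uncertainty.

```python
def annualized_risk(scenarios):
    """Return (annualized failure probability, annualized life loss).

    Each scenario combines the annual probability of a load range, the
    conditional probability of failure given that load, and the estimated
    life loss if failure occurs.
    """
    apf = sum(s["p_load"] * s["p_fail"] for s in scenarios)
    life_loss = sum(s["p_load"] * s["p_fail"] * s["life_loss"]
                    for s in scenarios)
    return apf, life_loss


# Hypothetical seismic load ranges for illustration only.
scenarios = [
    {"p_load": 1e-3, "p_fail": 1e-2, "life_loss": 50},
    {"p_load": 1e-4, "p_fail": 1e-1, "life_loss": 100},
]
apf, life_loss = annualized_risk(scenarios)
print(apf, life_loss)
```

The two returned quantities correspond to the axes typically used when a result is plotted against tolerable risk guidelines: the annualized failure probability on one axis and the life loss on the other.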

Risk assessment is the process of deciding whether existing risks are tolerable and whether present risk control measures are adequate, and if not, whether alternative risk control measures are justified or should be implemented. Risk assessment incorporates the risk analysis and risk evaluation phases. Tolerable risk means different things to different people and organizations. Some focus on economic risks to their companies or organizations (e.g., insurance, offshore oil, and gas), while others focus on the loss of life. Most regulators use a "risk curve" [17], either in the form of an *f*-N or an F-N chart.

The introduction and subsequent adoption of risk analysis into the dam safety community of practice in the United States was a complex journey [18]. While a complete treatment of this timeline is beyond the scope of this paper, notable milestones should be mentioned. A visionary look at risk related to dams was first introduced to the dam engineering community of practice in 1964 by Casagrande [19]. Following the failure of the Teton Dam on June 5, 1976, the creation of federal dam safety guidelines was ordered by President Carter in 1977 [20]. In response to the Teton Dam failure, Reclamation passed its own dam safety act in 1978 [21], and the Federal Emergency Management Agency (FEMA) was identified as the lead federal agency for oversight of dam safety in 1979 [22]. The 1978 Reclamation Act, along with the issue of the first federal guidelines for dam safety in 1979 [23], laid the groundwork for Reclamation's pioneering use of risk for their dam safety program. The foundational tools for performing a risk analysis of dams were introduced in 1981 by Whitman based on research in the MIT Civil Engineering program [24]. Reclamation issued its first Public Protection Guidelines in 1997 [25]. Dam safety guidelines were subsequently issued by the U.S. Army Corps of Engineers (USACE) [26] and the Tennessee Valley Authority (TVA) [27] in 2011, and a joint dam safety guideline for risk analysis by all federal agencies was issued by FEMA in 2015 [28]. Most recently, in 2021, the Federal Energy Regulatory Commission (FERC) promulgated regulations for dam safety risk analysis for all licensed hydropower projects with dams [29]. Most state agencies in the United States are now considering the adoption of risk analysis guidelines for their dam safety programs.

A comprehensive review of the risk analysis of dams has been provided by Hariri-Ardebili [30] and Hariri-Ardebili and Nuss [31]. Stedinger et al. [32] wrote a report explaining the concepts needed to perform a thorough probabilistic analysis of the dam safety issues. Bowles et al. [33] reviewed the relationship between the standard- and risk-based approaches for dam safety in the context of comprehensive risk management. Bowles [34] explained the step-by-step procedure for portfolio risk assessment of dams. Chauhan and Bowles [35] presented a framework for uncertainty analysis in dam safety risk assessment, including an approach to incorporating input uncertainties into the risk analysis model. Peyras et al. [36] proposed a method to support dam aging diagnosis and risk analysis that capitalizes on an expert's knowledge and feedback. Smith [37] proposed a model for dam risk analysis based on Bayesian networks. Serrano-Lombillo et al. [38] proposed a technique to calculate incremental risks in the context of an event tree. Castillo-Rodríguez et al. [39] proposed a variation of the combined risk analysis approach for complex dam-levee systems, which was based on the event tree analysis from multiple combinations of "load-system response-consequence" events. More recently, the authors of [40,41] combined the natural hazard-based assessment of dams with pandemic constraints and complex emergencies and presented the concept of a multi-risk-based framework for dam safety.

The use of risk analysis and assessment in dam safety, as summarized above, has been focused on the safety of existing dams and decisions to modify those dams to reduce risk to tolerable levels. USACE has recently issued some guidance related to risk-informed design [42]; however, this guidance is very high level, is focused on four high-level tolerable risk guidelines (TRGs), and is being mostly applied to existing dams and levees. Overall, USACE states that "the objective of risk-informed design is for risks to be tolerable for the final project, including the associated floodplain management practices". To meet this objective, teams performing designs must demonstrate that the completed design configuration will meet USACE risk tolerance guidelines portrayed on *f*-N or F-N charts provided in ER 1110-2-1156 [43].

#### *2.3. Objectives and Contributions*

The objective of this paper is to present a framework for the risk-informed design of new RCC dams, particularly in high seismic hazard regions, that will provide adequate long-term safety when evaluated under Reclamation's PPGs expressed on Reclamation's *f*-N chart. Such a design would also meet the Federal risk tolerance guidelines [28] as well as the similar guidelines of the USACE, TVA, and FERC. In a high seismic region such as the northwestern United States, including the location of Scoggins Dam, this was a very challenging task. Early structural modeling as part of an appraisal-level design indicated that the cross-section of the dam would need substantial adaptation from a traditional gravity section, and that the configuration of the dam would need to be changed to a curved gravity dam to achieve an adequate structural response and overall level of dam safety. To meet Reclamation's dam safety guidelines, a set of four design criteria was established against which the design configuration of the dam would be developed and subsequently evaluated. By meeting those criteria, the design team was confident that the design would have tolerable safety risks based on a full Quantitative Risk Analysis (QRA) performed by an experienced risk estimating cadre under the supervision of a well-qualified and experienced risk analysis facilitator at the completion of the feasibility design. Having tolerable risks would mean that a substantial redesign of the dam would not be needed and that the feasibility-level design could be compared to a companion design of modifications to the existing embankment dam for the purpose of making a final decision on the preferred project configuration to move forward into the final design.

The contribution of this paper is that of a completed feasibility design of a new RCC dam in a high seismic hazard region for which a full QRA completed by an experienced risk estimating team showed the design would meet federal dam safety guidelines for tolerable risk. The four risk-informed design criteria established as the basis for the design are believed to have much broader applicability within the dam safety community of practice for other new RCC dams and/or major hydraulic structures in high seismic regions with high potential life loss consequences if failure were to occur.

#### **3. The Proposed Concept of Risk-Informed Design for New RCC Dams**

The adoption of risk analysis as the framework for dam safety in the United States is challenging engineers to understand how existing facilities will respond to a large range of static, hydrologic, and seismic loading conditions. Risk analysis must consider the response of the reservoir system (including the dam, spillway, outlet works, and foundation treatments) for a wide range of loadings up to and including the onset of damage states, progressive damage development, and ultimately failure of the structure while at the same time considering the consequences of dam failure as first suggested by Casagrande in 1965 [19].

As will be shown, the northwest United States is a region of very large potential earthquakes associated with the CSZ capable of generating M9 earthquakes with unprecedented ground motion intensity and duration. To meet the requirements of risk-informed dam safety and design, engineers must understand the potential consequences of dam failure (life loss and other economic and environmental impacts), develop models, perform analyses, and adopt designs for seismic loadings beyond the range of documented loadings that have occurred at existing dams around the world. Hence, significant judgment is required in interpreting the results of modeling for which there is limited to non-existent calibration of the models to the actual performance of existing dams under very high loading conditions.

The concept of risk-informed design, as presented in this paper, requires adopting design criteria for four distinct performance domains: (1) where the linear (undamaged) response is expected, (2) where the onset of non-linear behavior (initiation of damaged states) occurs, (3) where an acceptable and limited amount of damage and related performance of the damaged structure will occur at the extreme end of loadings that risk analysis requires to be considered, and (4) where a final damage state is reached for which the post-earthquake performance of the dam and appurtenant structures would be acceptable and would not result in catastrophic failure and loss of life consequences.

#### **4. Regional Seismic Hazards**

Seismic hazards at the Scoggins site have been developed by Reclamation through site-specific evaluations. The hazard is generated from three separate potential sources, including (1) local (crustal) faults such as the Gales Creek fault system, (2) intraplate, and (3) intraslab. The intraplate and intraslab hazards are associated with the CSZ off the Oregon coast.

#### *4.1. Probabilistic Seismic Hazards*

Example hazards from the probabilistic seismic hazard assessment (PSHA) at the Scoggins dam site are summarized in Table 1. The table includes the return period and corresponding estimated peak ground acceleration (PGA) associated with the CSZ events as well as the total hazard PGA at the site.


**Table 1.** Probabilistic seismic hazards at the Scoggins Dam Site.

Note(s): g = gravitational acceleration.

#### *4.2. Representative Time Histories*

Time histories for use in structural response analyses and evaluations were created for the Scoggins site by Reclamation in 2012 and published in a Technical Memorandum in 2013 [44]. Seed records were scaled to the conditional mean spectra (CMS) that matched the uniform hazard spectra from the PSHA at either a 0.2 s period (short period [SP]) or a 0.75 s period (long period [LP]). Both were developed with the expectation that SP motions would be more detrimental to rigid structures (RCC dam, spillway, outlet works) and LP motions would be more detrimental to the embankment alternative being evaluated for the project. For the RCC dam structural analyses, one time history was developed to represent an event produced by a Gales Creek Fault system rupture (Record MYG008 from the M7.2 2005 Miyagi-Oki earthquake) and scaled to the SP CMS with a duration of about 40 s. Other seed records were considered from either the 2011 M9.0 Tohoku interface earthquake or as synthetic records matching the SP or LP CMS. These records represented an M9.0 interface earthquake having a total duration of over 240 s (period of strong shaking from 100 to 120 s) and included x, y, and z ground motion components. Ultimately, a synthetic time history, SRCH10-SP, along with the MYG008 motions, was used in the structural evaluation of the RCC dam. Example ground motion components and Husid plots for the SRCH10-SP 10,000-year event with a PGA of 1.36 g are shown in Figure A1 in Appendix A.
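A Husid plot is the normalized build-up of Arias intensity over time, and the time between the 5% and 95% levels is commonly used as a measure of significant duration. The following is a minimal sketch of that calculation in plain Python. The record used here is a hypothetical enveloped sine wave, not the SRCH10-SP or MYG008 motions, and the function names are illustrative only.

```python
import math

G = 9.80665  # standard gravity in m/s^2


def husid_curve(acc_g, dt):
    """Return (Arias intensity in m/s, Husid curve) for accelerations given in g."""
    acc = [a * G for a in acc_g]  # convert to m/s^2
    running = [0.0]
    for a0, a1 in zip(acc, acc[1:]):
        # trapezoidal integration of a(t)^2
        running.append(running[-1] + 0.5 * (a0 * a0 + a1 * a1) * dt)
    factor = math.pi / (2.0 * G)
    total = factor * running[-1]
    husid = [factor * r / total for r in running]
    return total, husid


def significant_duration(husid, dt, lower=0.05, upper=0.95):
    """Time between the lower and upper fractions of normalized Arias intensity."""
    i_lo = next(i for i, h in enumerate(husid) if h >= lower)
    i_hi = next(i for i, h in enumerate(husid) if h >= upper)
    return (i_hi - i_lo) * dt


# Hypothetical record: 2 Hz sine with a triangular envelope, 60 s long, peak 0.5 g.
dt = 0.01
n = 6001
record = [
    0.5 * (1.0 - abs(2.0 * i / (n - 1) - 1.0)) * math.sin(2.0 * math.pi * 2.0 * i * dt)
    for i in range(n)
]
arias, husid = husid_curve(record, dt)
print(f"Arias intensity: {arias:.2f} m/s")
print(f"5-95% significant duration: {significant_duration(husid, dt):.1f} s")
```

Applied to each of the x, y, and z components of a real record, the same routine would give the kind of curve shown in a Husid plot.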

#### **5. Risk-Informed Design Criteria**

Ferguson et al. [45] introduced the risk-informed design approach adopted for the ongoing designs of three new RCC dams that will be located in Oregon and Washington, where seismic hazards from the CSZ are critical design criteria for the layout and cross-sections of the dams.

The risk-informed design approach has evolved over a series of planning-level studies and has been substantially influenced by the author's and HDR's experience with Federal (including Reclamation's) standards-based design guidance as well as Risk-Informed Decision-Making (RIDM) guidance that has evolved in the United States since the late 1990s. In addition, the approach has evolved based on (1) an understanding of potential failure modes (PFMs) and related system response event trees used for quantitative risk analysis of concrete gravity dams (Reclamation/USACE Best Practices, 2019, Chapter E-3) on several large concrete dams, (2) an understanding of uncertainties associated with the estimation of loading frequencies, (3) estimation of the range and corresponding uncertainty associated with material properties for the dams and foundation materials at each site, and (4) the potential for strength degradation and changes to water pressures on critical potential failure surfaces.

The general risk-informed design approach for the Scoggins RCC replacement dam (sometimes referred to as Option 3 in the project alternatives evaluation) included separate but related seismic performance criteria. RIDM takes the important step of considering consequences of dam failure that are not typically considered as part of standards-based design, except through the dam's general hazard classification. Under RIDM, a high-hazard dam that could result in higher levels of life loss upon failure requires consideration of higher seismic loading. For example, a high-hazard dam that could result in the loss of 1 to 10 lives would consider seismic loading of up to the 1-in-10,000-year event. For higher life loss potential, the structure response must be evaluated for earthquakes with higher estimated recurrence intervals to assess risks fully. Based on experience performing full QRA, with either earthquake or hydrologic loading partitions that help identify that the maximum risk partition has been evaluated, it has become common to evaluate earthquakes with estimated recurrence intervals of 20,000 to 50,000 years. The maximum risk partition is typically associated with loading partitions that are less than these recurrence intervals, confirming that the design achieves an adequate level of risk against failure. For the Scoggins dam, the risk evaluation was complicated by the fact that estimated life loss consequences for the existing dam were much higher (over 100) than for the new downstream RCC dam. The new downstream dam would require the acquisition of a sawmill facility and a few residences. Once completed, the downstream dam would have estimated life loss consequences of about 10, or an order of magnitude lower. In order to compare the options, the same design criteria used for the evaluation of the option to modify the existing dam were used for the RCC dam and included earthquakes with up to a 50,000-year return period.
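To put the recurrence intervals above in perspective, the sketch below converts a return period into an annual exceedance probability and into the probability of at least one exceedance over an exposure period. It assumes independent years (a standard simplification), and the 100-year exposure period is a hypothetical value chosen only for illustration; it does not come from the Scoggins study.

```python
def annual_exceedance_probability(return_period_years):
    """Annual probability that the event with this return period is exceeded."""
    return 1.0 / return_period_years


def exceedance_probability(return_period_years, exposure_years):
    """Probability of at least one exceedance during the exposure period,
    assuming each year is independent of the others."""
    p = annual_exceedance_probability(return_period_years)
    return 1.0 - (1.0 - p) ** exposure_years


EXPOSURE_YEARS = 100  # hypothetical exposure period, for illustration only

for T in (1_000, 5_000, 10_000, 50_000):
    aep = annual_exceedance_probability(T)
    p_life = exceedance_probability(T, EXPOSURE_YEARS)
    print(f"{T:>6}-year event: AEP = {aep:.1e}, "
          f"P(exceeded in {EXPOSURE_YEARS} yr) = {p_life:.4f}")
```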

Based on experience with the earlier appraisal-level structural analyses and design development, a set of risk-informed design criteria was adopted for the Scoggins RCC dam feasibility design. Results of the appraisal level design full quantitative risk analysis suggested that such criteria would result in risk estimates that would be acceptable under Reclamation's PPGs (2011). The design criteria were as follows:

#### *5.1. #1 Elastic Response for 500- to 1000-Year Seismic Events*


#### *5.2. #2 Linear-Elastic Transitioning to Possible Localized Non-Linear Response with Limited Damage Beginning to Occur between the 1000- and 5000-Year Seismic Events*


#### *5.3. #3 Non-Linear Response, Moderate Damage, and Post-Earthquake Stability for Events Larger Than 5000-Year Return Periods: Earthquake Events with Estimated Recurrence Intervals of up to 1 in 50,000 Years Were Evaluated*


#### *5.4. #4 Post-Seismic Stability Factor of Safety (FOS) > 1.0*

• Predicted for all loading conditions, including the damage from the 10,000- and 50,000-year events when a reasonable lower bound residual friction angle of 35 degrees is assumed for the planes of sliding, and full uplift (drains assumed inoperable) is applied linearly along the sliding plane as a full normal operating reservoir at the upstream heel of the dam and tailwater at the toe of the dam.
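The post-seismic check in criterion #4 can be illustrated with a simple limit-equilibrium sliding calculation per unit width of dam. The sketch below is not the project calculation. It assumes a horizontal sliding plane, no cohesion, a linear uplift diagram from full reservoir head at the heel to tailwater head at the toe (drains inoperable), and it neglects the vertical water load on the sloping upstream face. The weight, base width, and water heads are hypothetical placeholders; only the 35-degree residual friction angle comes from the text.

```python
import math

GAMMA_W = 62.4  # unit weight of water, lb/ft^3


def full_uplift_force(head_heel_ft, head_toe_ft, base_width_ft):
    """Uplift per unit width (lb/ft) for a linear (trapezoidal) pressure diagram."""
    return GAMMA_W * 0.5 * (head_heel_ft + head_toe_ft) * base_width_ft


def net_hydrostatic_thrust(head_heel_ft, head_toe_ft):
    """Net horizontal water force per unit width (lb/ft), reservoir minus tailwater."""
    return 0.5 * GAMMA_W * (head_heel_ft ** 2 - head_toe_ft ** 2)


def sliding_fos(weight, uplift, driving, friction_angle_deg=35.0):
    """Frictional sliding factor of safety on a horizontal plane, no cohesion."""
    resisting = (weight - uplift) * math.tan(math.radians(friction_angle_deg))
    return resisting / driving


# Hypothetical section values, for illustration only.
head_heel, head_toe, base_width = 150.0, 20.0, 160.0  # ft
weight = 2.0e6                                        # lb per ft of dam length

uplift = full_uplift_force(head_heel, head_toe, base_width)
driving = net_hydrostatic_thrust(head_heel, head_toe)
print(f"Post-seismic sliding FOS = {sliding_fos(weight, uplift, driving):.2f}")
```

A value above 1.0 from a check of this kind would correspond to the criterion stated above; the actual evaluation in the paper relies on the LS-DYNA models and damaged-state geometry.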

#### **6. Structural Model and Modeling Approach**

Structural analyses supporting the feasibility design included the development of both 2D and 3D structure finite element models. A curved gravity configuration was adopted for the feasibility design based in part on the results of appraisal design structural modeling and QRA evaluations previously described. A plan and profile view of the dam and a typical non-spillway-overflow cross-section through the dam at the maximum structural height section are shown in Figure A2 in Appendix B. A cross-section through the dam that was developed to meet the risk-informed design criteria described in Section 5 is shown in Figure A3 in Appendix B.

The 3D structural analysis model included 28 monoliths. All monoliths are 50 feet (15.24 m) wide except the monolith containing the low-level outlet facility in the lower left abutment area, which has a width of 100 feet (30.48 m). Dam monoliths would be formed by transverse contraction joints (TCJs) expected to control the behavior of the dam, including the potential for cracking and movement of monoliths during an earthquake event. The 2D models of the dam were developed late in the structural analysis process to further evaluate the potential for cracking in the upper portion of the dam that was indicated by the 3D analysis results for the 50,000-year, long-duration earthquake time histories.

The 3D and later 2D analysis models were developed in LS-DYNA (version 11.0). These models can consider both linear-elastic and non-linear structural behavior (both material models and contact surfaces) and the loading conditions during which a change from linear-elastic to non-linear behavior is likely to occur. The initial 3D model included provisions for non-linear behavior using specific contact surfaces where cracking and/or sliding was expected to occur, including the dam–foundation contact and along TCJs between the dam monoliths. The type of LS-DYNA contact used included "Tiebreak with Friction" for the dam/foundation contact and "Sliding with Friction" for the TCJs. Study cases were included as part of the 3D seismic response analyses where the non-linear behavior along the dam foundation contact was prohibited so that locations within the dam could be identified where overstressing, potential cracking, and damage would begin to occur. As the analysis work continued, additional contact surfaces were added to the 2D model of a central monolith exhibiting maximum seismic response (overstressing) so that the potential for cracking in the upper portion of the dam could be evaluated.

#### *6.1. Model Descriptions including Provisions for Non-Linear Response*

The 3D structural analyses included three major components: the foundation, the reservoir, and the dam, which includes voids for the outlet works, sluiceways, and drainage gallery in the dam. The model was built in accordance with Reclamation's state-of-practice for non-linear analysis for concrete dams [46] and is shown in Figure 2. The model extends 7400 feet (2255.5 m) in the north–south (cross-canyon) direction, 6500 feet (1981.2 m) in the east–west direction (upstream-downstream), and is roughly 1000 feet (304.8 m) deep.

**Figure 2.** 3D Isometric View of Scoggins RCC Dam LS-DYNA Model. Brown color represents abutment and foundation materials, blue shows limits of reservoir included in the model, and the curved and colored item in the middle of the figure is the proposed dam.

A plan and exploded isometric view looking at the downstream face of the dam in the model is shown in Figure 3.

**Figure 3.** Scoggins RCC Dam LS-DYNA Model: (**a**) plan view; (**b**) exploded isometric view. Plan and outer profile view colors represent different monoliths in the dam. Inner color shown on profile represents the inner zone of RCC materials with different material properties.

A cross-section of the 2D model established through monolith 15 is shown in Figure 4. The upper portion of Figure 4a shows a general view, including the maximum non-overflow section, foundation, and reservoir, and Figure 4b shows a close-up view of the cross-section showing locations where contact elements were included in the 2D model at the dam-foundation contact and in the upper portion of the dam at the base of the chimney section.

**Figure 4.** 2D Model through monolith 15 showing: (**a**) dam, foundation, and reservoir; (**b**) close-up of cross-section showing contact elements at the rock (foundation)-concrete interface, potential inclined crack, and lift joint crack at the base of the chimney section. The brown elements represent the foundation bedrock, blue represents the reservoir, and the four colors in the dam cross-section represent the different parts of the dam produced by the contact surfaces within the model (labeled 1, 2 and 3).

The foundation in the 3D model was represented by a combination of 200-foot by 200-foot by 40-foot-deep (61 m by 61 m by 12.2 m) coarser mesh blocks except in the immediate vicinity of the dam, where 200-foot (cross-canyon) by 50-foot (upstream-downstream) by 40-foot-deep (61 m by 15.24 m by 12.2 m) blocks were used to facilitate establishment of the dam-foundation contact.

The reservoir was simulated at the normal maximum operating level of elevation 303.5 feet (92.5 m). The reservoir geometry was dictated by the planned maximum reservoir level, dam limits, and foundation geometry. For short-duration earthquakes, the reservoir was modeled using the LS-DYNA elastic fluid material model and reduced-integration brick elements with hourglass Type 2, which yielded the lowest undesirable hourglass energy and allowed the water to flow with sliding of the dam and without loss of the reservoir force at the end of the earthquake. Significant problems with the elastic fluid model elements were experienced for long-duration earthquakes. Attempts were made to resolve the issue with the longer-duration earthquakes related to element distortion, including remeshing and increasing the stiffness of reservoir elements near the bank of the reservoir. Subsequently, in consultation with Reclamation modeling experts, it was agreed that the reservoir portion of the model would be changed to a linear elastic material with a Poisson's ratio of 0.4999 and a modulus of elasticity of 189.7 psi (pound-force per square inch) (1.31 MPa) consistent with Reclamation guidance [46]. With these changes, the model for all long-duration earthquake loadings was run until a time of about 142 s, when runs stopped due to insufficient memory. A restart run for a selected case was completed and verified that simulations up to 142 s were capturing the maximum displacement response of the dam consistent with the Arias intensity variation with time.
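The choice of a Poisson's ratio of 0.4999 with a very low modulus of elasticity can be understood by converting the pair to bulk and shear moduli with the standard isotropic elasticity relations. The sketch below does this for the stated values. The interpretation that these values were chosen to mimic a nearly incompressible, nearly shear-free fluid is ours and is offered only as a plausibility check, not as a statement of the Reclamation guidance.

```python
PSI_TO_PA = 6894.757  # pascals per psi


def bulk_modulus(E, nu):
    """Bulk modulus of an isotropic linear elastic material."""
    return E / (3.0 * (1.0 - 2.0 * nu))


def shear_modulus(E, nu):
    """Shear modulus of an isotropic linear elastic material."""
    return E / (2.0 * (1.0 + nu))


E_psi, nu = 189.7, 0.4999  # reservoir material values stated in the text

K_psi = bulk_modulus(E_psi, nu)
G_psi = shear_modulus(E_psi, nu)

print(f"Bulk modulus:  {K_psi:,.0f} psi ({K_psi * PSI_TO_PA / 1e9:.2f} GPa)")
print(f"Shear modulus: {G_psi:.1f} psi ({G_psi * PSI_TO_PA / 1e6:.2f} MPa)")
```

The resulting bulk modulus is close to the commonly quoted value for water of roughly 2.2 GPa, while the shear modulus is negligible by comparison, so the elastic solid behaves much like a compressible fluid without the element distortion problems noted above.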

Contact surfaces in the 3D model were used to evaluate the interaction between the dam monoliths, the dam monoliths and foundation, the dam and the reservoir, and the reservoir and foundation. All contacts were defined per Reclamation's state-of-practice guidelines [46].

A global (Rayleigh) damping value of 1.8 was used in the analysis during the application of static loads, including the maximum reservoir loading. Once the initial static loading condition was completed, the prescribed foundation boundary conditions used for static loading were replaced with a non-reflecting boundary and static reaction forces. The three-component earthquake traction loadings were applied at the base of the model, and the mass damping was reduced to near zero (0.001) and remained at this level during the application of seismic loads. Energy dissipation was limited to that which would occur at the non-linear contact surfaces and through foundation and reservoir radiation damping. The sufficiency of this approach was evaluated and verified in a test model that applied an impulse load during an intermediate stage of loading. The natural period and damping of the dam were estimated at 0.33 s and 6.6%, respectively. Further, the application of increasing load in the last stage of the test model verified that both overturning and sliding failure at the base of the dam could be captured by the model.
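For mass-proportional Rayleigh damping, the equivalent damping ratio at a given circular frequency follows from the standard relation ξ = α / (2ω). The sketch below evaluates this at the reported natural period of 0.33 s. It assumes that the values 1.8 and 0.001 are mass-proportional damping constants in rad/s, which the wording "mass damping" suggests but the text does not state explicitly. The 6.6% damping reported from the impulse test includes other dissipation mechanisms and is not reproduced here.

```python
import math


def mass_damping_ratio(alpha, period_s):
    """Equivalent damping ratio from mass-proportional damping alpha (rad/s)
    at the circular frequency corresponding to period_s."""
    omega = 2.0 * math.pi / period_s
    return alpha / (2.0 * omega)


T_N = 0.33  # natural period of the dam reported in the text, s

for label, alpha in (("static stage", 1.8), ("seismic stage", 0.001)):
    xi = mass_damping_ratio(alpha, T_N)
    print(f"{label}: alpha = {alpha} -> damping ratio = {100.0 * xi:.4f}%")
```

Under this assumption, the seismic-stage mass damping contributes a negligible damping ratio at the fundamental period, which is consistent with the statement that energy dissipation was left to the contact surfaces and radiation damping.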

#### *6.2. Material Properties*

The foundation properties used in the 3D and 2D models were estimated characteristic values based on the evaluation of the results of three phases of site characterization and expert judgment. A summary of the characteristic properties developed from the results of lab testing, downhole and surface geophysical testing, and rock mass classification of rock core samples is provided in Table 2.

**Table 2.** Foundation material properties.

| Material Properties | Assigned Value |
|---|---|
| Average Bedrock Density (γ) | 145 lb/ft<sup>3</sup> (22.78 kN/m<sup>3</sup>) |
| Rock Mass Deformation Modulus (E) | 1.00 × 10<sup>6</sup> lb/in<sup>2</sup> (6895 MPa) |
| Poisson's Ratio (ν) | 0.32 |
| Shear Wave Velocity (Vs) | 3500 ft/s (1067 m/s) |
| P-Wave Velocity (Vp) | 6800 ft/s (2073 m/s) |

Note(s): ft/s = feet per second, lb/ft<sup>3</sup> = pounds per cubic foot, lb/in<sup>2</sup> = pounds per square inch, m/s = meters per second, MPa = megapascals, kN/m<sup>3</sup> = kilonewtons per cubic meter.
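As a consistency check on the Table 2 values, the shear and P-wave velocities implied by the density, modulus, and Poisson's ratio can be computed from the standard relations for an isotropic linear elastic medium. This is a sketch under that assumption only. Treating the rock mass deformation modulus as the dynamic modulus is a simplification that the source does not claim.

```python
import math

LB_FT3_TO_KG_M3 = 16.018463  # kg/m^3 per lb/ft^3
PSI_TO_PA = 6894.757         # Pa per psi
M_TO_FT = 3.280840           # ft per m


def wave_velocities(E_pa, nu, rho_kg_m3):
    """Return (Vs, Vp) in m/s for an isotropic linear elastic medium."""
    shear = E_pa / (2.0 * (1.0 + nu))
    p_wave_modulus = E_pa * (1.0 - nu) / ((1.0 + nu) * (1.0 - 2.0 * nu))
    return math.sqrt(shear / rho_kg_m3), math.sqrt(p_wave_modulus / rho_kg_m3)


# Values taken from Table 2.
rho = 145.0 * LB_FT3_TO_KG_M3
E = 1.00e6 * PSI_TO_PA
nu = 0.32

vs, vp = wave_velocities(E, nu, rho)
print(f"Vs = {vs:.0f} m/s ({vs * M_TO_FT:.0f} ft/s), Table 2 lists 1067 m/s")
print(f"Vp = {vp:.0f} m/s ({vp * M_TO_FT:.0f} ft/s), Table 2 lists 2073 m/s")
```

Both computed velocities fall within about 1% of the tabulated values, which suggests the listed properties form a mutually consistent elastic parameter set.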

The target unconfined compressive strength selected for the feasibility-level structural analyses of the dam cross-section was based on the appraisal-level structural analysis results as well as feasibility-level RCC mix design studies of both an on-site sandstone aggregate source and an off-site quarry source. Both the on-site and off-site sources were found to provide coarse and fine aggregate materials that would meet acceptable quality requirements for the RCC. Because of the uncharacteristic cross-section properties, including the sloping upstream face and relatively flat downstream slope of the dam, and considering the suitability of the on-site sandstones for use in the construction of the dam, a zoned cross-section was selected, allowing for materials having different specified strengths and an expected overall construction cost reduction. The inside of the dam would be constructed of material with lower specified strength than the outer surfaces of the dam.

To inform the risk analysis of the dam, analyses were performed using the target strengths of the RCC at one year of age as well as the anticipated long-term strength of these materials (greater than 10 years of age). The following 1-year design strengths were considered in the development of engineering properties used in the structural analyses:


The average concrete densities from the test mixes for the inner and outer zones, along with that of the structural concrete for the outlet works, are summarized in Table 3. The modulus of elasticity of concrete material is dependent on age, strength, and aggregate type. The modulus values selected for the structural analyses based on the one-year unconfined compressive strength targets are summarized in Table 4. Estimated tensile strengths and estimated shear strengths of the target RCC materials are summarized in Tables 5 and 6.

**Table 3.** Density of concrete for monoliths and structures.


Note(s): N/A = not applicable, SS = blend of sandstone coarse and fine aggregate, SB = blend of sandstone coarse and basalt fine aggregate, BB = blend of basalt coarse and fine aggregate.


**Table 4.** Modulus of elasticity and Poisson's ratio for the monoliths.

**Table 5.** Summary of estimated tensile strength of RCC materials for structural analyses and evaluation of modeling results.


Note(s): <sup>1</sup> 80% applied to intact RCC strength to account for lift joints. <sup>2</sup> 150% of the static tensile strength adjusted for lift joints.


**Table 6.** Summary of estimated shear strength of RCC materials for structural analysis and evaluation of modeling results.

Note(s): deg. = degrees.

RCC materials continue to gain strength over time resulting in higher compressive and tensile strength capacity. To inform the risk analyses for the feasibility design configuration, several study cases were evaluated with what were judged to be reasonable "long-term" strength properties (properties after about 10 years of age), including:


The material properties of the RCC materials were developed not only from the mix design studies but from published sources. Splitting and direct tensile, as well as shear strength testing, was not performed as part of the mix design studies; rather, the assignment of those strengths from published sources was judged to be sufficient for the feasibility design. Such testing will be considered during the subsequent final design work if structural analyses indicate that it is warranted.

The contact surface properties used for each contact surface in the 3D model are summarized in Table 7.


**Table 7.** Contact Surfaces used in the 3D Finite Element Model.

Note(s): <sup>1</sup> This is a conservative but commonly used value for TCJs in structural modeling. <sup>2</sup> The foundation contact strength neglects the expected rough contact found during construction. <sup>3</sup> 0.69 to 1.72 MPa. <sup>4</sup> TCJs = Transverse Contraction Joints.

#### *6.3. Cracking Potential in the Upper Portion of the Dam*

The seismic loading potential at the Scoggins site is extraordinary. The seismic loading being considered for the feasibility design is unprecedented for any concrete dam in the world (Weiland, 2021, personal communication [47]). Consequently, it was important to consider the potential for cracking and adverse response at other locations within the dam as part of the quantitative risk analysis of the feasibility design. For purposes of feasibility design, this hazard was judged to be in the upper portion of the dam (base of the chimney section), consistent with noted cracking case histories such as Hsingfengkiang Dam, China (1962); Koyna dam, India (1967); and Sefed-Rud Dam, Iran (1990). The use of non-linear material models to inform potential cracking locations in the dam was considered, but it was decided to defer those analyses to a later stage of project development.

Two additional contact surfaces were added to the upper portion of the 2D model of monolith 15 to evaluate the potential preference for cracking to develop. In general, except as noted below, the contact surfaces were oriented to be consistent with crack formation at Koyna dam but adapted for RCC construction methods. The horizontal surface labeled as 3 in Figure 4 corresponds to the typical orientation of one-foot-thick horizontal lift surfaces and generally aligns with the maximum principal stresses (Z-stresses) observed near the upstream face and into the central portion of the dam cross-section. The inclined contact shown as contact surface 2 in Figure 4 was intended to represent cracking that could initiate from the downstream face of the dam. While this surface was extended through the entire section of the dam, the likelihood of such a crack forming through the entire section was judged to be extremely low (see discussion below). It was included to fully understand and describe the potential behavior of the upper portion of the dam, particularly the potential for upstream movement of the curved gravity section.

Cracking initiates and develops in response to the orientation of the major principal stresses. Results of both the 3D and 2D structural analysis models show that the major principal stresses orient parallel to the upstream and downstream faces of the dam. For larger earthquakes, the major stresses begin to cycle between tension and compression. It is tensile stresses resulting from the movement of the dam crest in either the upstream or downstream directions that cause the initiation and progression of cracking.

While there is a potential for both horizontal and inclined crack surfaces to develop in the upper portions of conventional concrete dams, the method of RCC dam construction suggests a predominant likelihood for horizontal cracking along lift surfaces versus the development of inclined cracks across the RCC layers through intact materials. If a crack initiates through intact concrete materials in various lifts due to principal stress orientations (e.g., cracks initiating from the downstream face of the dam), it is likely that the crack formation would be influenced by lift surface properties as the crack propagates, and the crack would likely take on a stair-stepped and eventually a bi-linear configuration toward the center of the section where principal stress orientation begins to provide a preference for the development of tension perpendicular to the lift surfaces. In fact, cracking from the upstream face of the dam would likely initiate at a lift surface and follow that surface until the crack intersects any cracks developing from the downstream face. With strong enough and long enough shaking, a continuous crack with an approximate bi-linear configuration could develop through the cross-section. Such a bi-linear configuration is shown at the bottom of Figure 4. The possible bi-linear crack forms the bottom of the block labeled as Part 4, shown in the light tan color.

Both Chopra and Chakrabarti (1973) [48] and Nuss et al. (2012) [49] note that the cracking at the Koyna dam that occurred at the base of the dam's chimney section was horizontal from both the upstream and downstream faces. Physical (Mridha and Maity, 2014 [50]) and numerical models of crack propagation with SFEM methods (Wang et al., 2015) [51] indicate a preference for bi-linear crack configuration.

An additional consideration in the evaluation of potential cracking-related failure modes in the upper portion of the dam cross-section would be the contribution of side forces on moveable blocks. The TCJs are very rough, and any movement upstream or downstream would have to overcome some element of roughness along the TCJs. Movement in the downstream direction would likely encounter increasing resistance to deformation as the radial monolith joints engage and arch action is mobilized. For blocks moving in the upstream direction of the curved gravity configuration of the dam, the TCJs would likely offer an initial significant amount of side resistance. However, if enough movement occurs, the TCJs could begin to open with a reduction and possible full elimination of side resistance unless enhanced roughness was introduced in the TCJ construction.

#### *6.4. Model Setup and Calibration*

Static and dynamic loads were applied in LS-DYNA using multiple load curves. Gravity and uplift loads were applied on the same load curve, ramped up from zero to full force at 2 s, followed by 1 s of quiet time before earthquake loads began at 3 s into the model runs. Full gravity and uplift loads were held constant until the end of the runs.
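The loading sequence described above can be sketched as a pair of simple time functions. This is only an illustration of the timing stated in the text (ramp to full static load at 2 s, earthquake input from 3 s). The function names are hypothetical, and the linear ramp shape is an assumption, since the text does not state the shape of the ramp.

```python
def static_load_scale(t, ramp_end=2.0):
    """Scale factor (0 to 1) for gravity and uplift loads.

    Assumes a linear ramp from zero at t = 0 to full load at ramp_end,
    after which the load is held constant.
    """
    return min(max(t / ramp_end, 0.0), 1.0)


def earthquake_active(t, quake_start=3.0):
    """True once the earthquake traction input has started."""
    return t >= quake_start


times = [0.0, 1.0, 2.0, 2.5, 3.0, 10.0]
scales = [static_load_scale(t) for t in times]
active = [earthquake_active(t) for t in times]
```

Between 2 s and 3 s the static scale is already 1.0 while the earthquake flag is still false, which corresponds to the quiet time in the model runs.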

Gravity was applied to all the dam and reservoir elements to achieve a steady static stress state before seismic loads were applied. This avoided any unfavorable behavior in the foundation and unrealistic foundation settlement. Hydrostatic loads were generated against the upstream face of the dam as the gravity load of the reservoir was applied. Uplift pressures were applied along opposite faces of the foundation contact surface using element face pressures and a load curve. Water pressures along the contact could not be implemented from the reservoir block in the model. Silt loads were not considered in the analysis.

For the feasibility analyses, construction sequencing and thermal analyses were not performed as it was anticipated that initial stresses within the dam would not control the dam's behavior during seismic loading. To confirm that rigorous modeling of initial stress conditions was not required, an evaluation of three alternative normal loading models (application of self-weight and normal reservoir hydrostatic pressures relative to the activation of vertical contact elements between the monoliths) was completed. These analyses confirmed that the time when activation of vertical contact between the dam monoliths occurred did not significantly influence initial stresses and the estimated seismic response along the contact at the base of the dam or the vertical monolith joints.

Full hydrostatic pressure was assumed at the upstream heel of the dam, and tailwater pressure was assumed at the downstream toe for all structural analyses. The uplift pressure varied linearly from full head to tailwater head when a gallery was not present in the monolith (upper abutment locations). At locations where a gallery was present, a drain efficiency of 65% was used for normal and earthquake loading conditions. Uplift pressure values at nodal contact points were calculated using an Excel spreadsheet, and these pressures were applied as non-uniform pressure on the slave and master surfaces. For the flood loading (reservoir elevation 312 feet) and post-earthquake loading conditions, a 0% drain efficiency was assumed.
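A minimal sketch of how such an uplift head distribution could be computed at points along the base is given below. It is not the spreadsheet used in the study. It assumes that the drain efficiency reduces the head in excess of tailwater that the linear (undrained) distribution would give at the drain line, and that the head varies linearly on either side of the drain. The function name, the drain position, and the numbers in the usage lines are hypothetical.

```python
def uplift_head(x, base_length, head_up, head_down, drain_x=None, efficiency=0.0):
    """Uplift head at distance x from the heel (0 <= x <= base_length).

    Without a drain the head varies linearly from head_up (heel) to
    head_down (toe). With a drain at 0 < drain_x < base_length, the excess
    head over tailwater at the drain line is reduced by `efficiency`
    (an assumed convention), giving a bilinear distribution.
    """
    def linear(x0, h0, x1, h1, xq):
        return h0 + (h1 - h0) * (xq - x0) / (x1 - x0)

    if drain_x is None or efficiency == 0.0:
        return linear(0.0, head_up, base_length, head_down, x)

    undrained = linear(0.0, head_up, base_length, head_down, drain_x)
    head_drain = head_down + (1.0 - efficiency) * (undrained - head_down)
    if x <= drain_x:
        return linear(0.0, head_up, drain_x, head_drain, x)
    return linear(drain_x, head_drain, base_length, head_down, x)


# Hypothetical geometry: 100 ft base, 170 ft headwater, 20 ft tailwater head
no_gallery_mid = uplift_head(50.0, 100.0, 170.0, 20.0)
drained_at_gallery = uplift_head(10.0, 100.0, 170.0, 20.0, drain_x=10.0, efficiency=0.65)
```

Setting `efficiency=0.0` recovers the linear distribution assumed for the flood and post-earthquake cases.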

The LS-DYNA dynamic analysis of the foundation-dam-reservoir system used the traction time history at the base of the model as input. In accordance with Reclamation guidelines for non-linear analysis of concrete dams [46], a non-reflecting foundation boundary at the sides and base of the foundation materials with a shear velocity of 1000 m per second was applied. Through an iterative deconvolution process, an estimated factor of 0.55 was applied to the input ground motions provided by Reclamation based on estimated foundation stiffness properties to reproduce the target input motions and site response spectrum at the foundation surface (base of the dam) beneath the maximum section location. Deconvolution was verified through analyses with two additional foundation models.

The production models for the structural analyses differed mainly in material parameters and earthquake input motions. For all production models, the static loading, including the weight of the dam and reservoir water (but not the foundation weight) as well as uplift pressures, was applied during the initial interval from zero to 2 s. This was followed by a silent time of one second, after which the earthquake traction load was applied at the base of the model. The earthquake load was continued until a time sufficient to capture the maximum responses of the dam. For the long-duration earthquakes, it was found that the maximum permanent displacements and stresses were captured within the first 142 s of the time history. For the short-duration earthquakes, maximum permanent displacements were captured within the first 20 to 25 s.

#### *6.5. Model Study Cases*

A total of 20 study cases were outlined for seismic response analysis with the 3D model. The study cases are listed in Table A1 in Appendix C. Eighteen of these study cases were completed. The first three study cases represented a range of planned static analyses that were subsequently combined into a single analysis. Because the 3D model did not provide a direct indication of the stability FOS, supplemental 2D analyses (excluding consideration of arch stresses that may have developed) of the full range of monolith heights were completed.

As can be seen in Table A1, the study cases evaluated static loading conditions of the maximum reservoir level and ground motions from three earthquake recurrence intervals, including 5000-, 10,000-, and 50,000-year events. Short-duration and long-duration events were completed for each of the recurrence intervals, and base case RCC strength (one-year target strengths), as well as long-term RCC strengths, were also considered. Post-earthquake analyses were completed assuming degraded strengths of φ = 35° and c = 100 psi [0.69 MPa] before full crack development and φ = 35° and c = 0 after crack development was completed to bound the estimates of monolith deformations (maximum possible under worst-case assumptions) and to verify that movement of the monoliths stopped at the end of the earthquake loading. Finally, uplift loads on the dam/foundation contact were considered, including full functionality (65% reduction) during the earthquake, as well as ranging from fully functional to non-functional following the earthquake (post-earthquake study cases).

Additional 2D study cases were added during a later stage of the structural modeling work. This 2D model and related study cases were designed to inform the potential for cracking and displacements in the upper portions of the dam during large and long-duration earthquake events. A total of 10 study cases with various combinations of contact restraint, material strengths along the contact surfaces, and earthquake loads were completed. The study cases are summarized in Table A2 in Appendix C. The 2D modeling was used in lieu of using the large 3D model in the interest of time. Similar to using non-linear material models, the use of the large 3D model for evaluation of cracking potential in the upper portion of the dam was deferred to a later stage in the design process when additional time and resources would be available, and the results of the initial discrete crack and 2D results (including applied radial shear resistance as described below) could be used to inform the requirements for more rigorous computational approaches.

As was expected, deformations predicted with the initial 2D modeling approach were substantially higher than predicted by the 3D model for similar strength assumptions and earthquake loads. This difference was attributed primarily to the inability of the 2D model to incorporate shear strength along the TCJs, and the development of arch stresses and transfer of arch loads that will substantially influence the performance of the dam, particularly downstream displacements that engage the vertical monolith joints and initiate arch load transfer. Subsequently, the 2D model was modified to include components of radial shear resistance on the TCJs based on the 3D modeling analysis results. The components of radial shear resistance were conservatively estimated (lower than the 3D model indicates will occur with downstream movements) and then incorporated into the 2D model. With this change, the total deformations estimated by the 2D model were similar to the 3D model estimates.

The 2D study cases ranged from full linear elastic with all contact surfaces tied (Case 1 with bond strength assumed as 1 × 10<sup>20</sup> psi [6.9 × 10<sup>17</sup> MPa]) to various combinations of tied base, lift surface, and inclined contact surfaces and combinations of untied contact tensile and shear strength assumptions. Study cases 3, 4, and 5 considered all contact surfaces untied with both best estimate and long-term tensile and shear strength assumptions along the lift surfaces or, in the case of the inclined contact, intact tensile and shear strength. Case 10 allowed only the inclined contact surface to crack. As previously noted, this case was judged to be extremely unlikely to occur but was run to inform the risk estimators.

#### **7. Model Results**

#### *7.1. Static Analyses*

Overall, the evaluation of model contact forces and gravity analyses indicated that no nonlinearity (sliding) would occur at the dam-to-foundation interface. All monoliths would have a sufficient sliding FOS (>1.5), assuming a sliding friction angle of 45 degrees, zero cohesion, and 65% foundation drain efficiency. The minimum FOS values occur under the maximum sections of the dam with a base elevation of 130 feet (39.6 m) or less. Further, all monoliths would have a FOS greater than 1.2 for post-earthquake conditions assuming that the sliding friction angle degraded to 35 degrees, foundation drains were not functional under the dam, and full reservoir uplift pressure was applied under the entire base of the monolith.
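For readers unfamiliar with the sliding check, the following sketch shows a common limit-equilibrium (shear-friction) form of the sliding FOS, in which Mohr-Coulomb resistance on the base is divided by the driving horizontal force. The article does not state the exact expression used, so this form, the function name, and the force values in the usage lines are assumptions for illustration only.

```python
import math


def sliding_fos(normal_force, uplift_force, driving_force,
                friction_deg, cohesion=0.0, area=0.0):
    """Sliding factor of safety on a planar base (assumed shear-friction form).

    resisting = c * A + (N - U) * tan(phi); FOS = resisting / driving.
    Forces and c * A must be in consistent units.
    """
    effective_normal = normal_force - uplift_force
    resisting = cohesion * area + effective_normal * math.tan(math.radians(friction_deg))
    return resisting / driving_force


# Hypothetical forces per unit length of dam (consistent units)
fos_45 = sliding_fos(1000.0, 200.0, 400.0, friction_deg=45.0)
fos_35 = sliding_fos(1000.0, 200.0, 400.0, friction_deg=35.0)
```

With zero cohesion the FOS scales with tan φ, so degrading the friction angle from 45 to 35 degrees reduces the FOS by roughly 30% before any change in uplift is considered.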

Static analyses were also evaluated for normal reservoir operations and a reservoir elevation of 310.5 feet (94.9 m; PMF loading), assuming that a large enough earthquake had occurred at the site to cause cracking through the entire base of the dam and enough sliding to result in some strength degradation and loss of foundation drain function. The results of these analyses are summarized in Table 8. Sliding FOS for friction angles of 45 and 35 degrees in combination with a range of selected cohesion values are also shown in this table. These strength combinations indicated sliding stability FOS values equal to 3.0 and 4.0, while for the PMF loading, the FOS values are equal to 2.0 and 2.7.

**Table 8.** Summary of static gravity analyses for increased uplift and reduced contact strength assumptions.


Figure 5 shows a summary of the 2D gravity analysis results for sliding FOS for a normal maximum reservoir elevation (303.5 feet [92.5 m]) and a very large flood (pool elevation to 312 feet [95.1 m]), a base friction angle of 45 degrees and the full range of monolith heights along the axis of the dam. The sliding FOS values increase at higher elevations, supporting the previous assertion that the minimum FOS values are beneath the maximum sections of the dam. Subjected to static loading, the proposed dam has tremendous reserve strength if arch action is ever mobilized, demonstrating the benefit of curving the dam.

**Figure 5.** Graphical depiction of gravity analysis results for static loading sliding factor of safety (base friction of 45 degrees).

The LS-DYNA dynamic analyses indicate that once sliding initiates, the net resultant of forces along the vertical contacts of the maximum height monoliths near the center of the dam mobilizes a resultant upstream–downstream horizontal force. For a one-foot-thick section of a dam, the average equivalent shear stress of 0.15 psi (0.001 MPa) acts on the vertical sides. A post-earthquake gravity analysis considering half of this estimated side shear stress, a residual friction angle of 35 degrees with no cohesion, and a uniform reservoir pressure along the entire length of the cracked base indicates that maximum height monoliths of the dam have a minimum FOS of 1.2.

#### *7.2. 3D Seismic Response Analyses*

Based on work completed during the appraisal design phase, it was clearly demonstrated that the plan and cross-section requirements for the dam were going to be controlled by seismic loading response. The study cases were run and evaluated to verify that the proposed cross-section would meet the risk-informed design criteria and have failure risks that would meet Reclamation's Public Protection Guidelines (2011) [52]. The following critical PFMs for the dam were identified as driving the design and risk estimates:

1. Sliding of the dam at the dam-to-foundation interface or along a lower lift (just above the foundation contact) that leads to a large displacement between adjacent monoliths or instability of one or more monoliths;

2. Overstressing during the earthquake leading to cracking and sliding in the upper part of the dam with significant degradation (rubbilizing) of the concrete along vertical monolith joints.

The study case analyses summarized in Tables A1 and A2 (Appendix C) formed the basis for assessing these PFMs and completing a full QRA of the proposed dam. Modeling results for different combinations of earthquake loadings and dam-to-foundation interface strength were carefully reviewed to evaluate stress time histories and estimated monolith displacements, identifying critical time steps and locations within the model corresponding to the maximum response of the dam. The analyses were further reviewed at those specific time snapshots and locations where the structure displacement and stress responses were highest.

A set of common points along the crest of the dam was used to plot and then select the range of estimated displacements. An example of a deformed shape of the dam following a 5000-year, short-duration (MYG008) earthquake event and assumed 1-year material properties for the RCC is shown in Figure 6. The representative crest nodal points (A through I) established for the comparative evaluation of crest displacement are also shown in this figure. The black dots represent the original location of the nodal points relative to the deformed shape of the dam. Nodes A–C, D–F, and G–I are left abutment, central valley, and right abutment response locations, respectively.

**Figure 6.** Plan view of dam showing deformed shape (magnification factor of 300) following the 5000-year MYG008 Earthquake. One-year strength parameters. Different colors represent different monoliths within the dam. Letters A through I represent the nodal displacement plot locations shown in Figure 7. Corresponding nodal numbers used to obtain displacement plots from model are listed adjacent to the letters A through I.

An example of a plot of the estimated displacement for each of the A through I nodal points is shown in Figure 7. The displacements are total absolute displacements of the monoliths above the foundation contact. Relative displacements of the monoliths are estimated by comparing the total displacements of adjacent monoliths. The accumulated nodal point displacements are an indication of the monoliths that cracked through the base (A through G) versus those that did not crack (H and I) for the assumed material properties. For example, Nodes A through G show some accumulated displacements at the end of the earthquake, ranging from less than 1 inch (<2.54 cm) for Node G to as much as 4 inches (10.16 cm) for Node B. Cracking through the base of the dam would have occurred in all monoliths represented by these nodes for displacements to have developed. However, the accumulated displacement of nodes H and I in the upper right abutment was zero inches, indicating that cracking through the base of the monolith at those locations likely did not occur.

**Figure 7.** Plot of crest node displacement for the 5000-year, short-duration earthquake (MYG008) and 1-year strength parameters. (x-displacement is in inches, and time scale is in seconds. On the vertical axis, positive displacement is movement downstream, and negative (−) is movement upstream).

Tables A3 and A4 in Appendix C contain summaries of the estimated displacements for the base case and long-term strength assumptions for the MYG008 Short Duration, and SRCH10 Long Duration (CSZ) ground motions, respectively. Displacement estimates are also shown for both the post-earthquake 10,000-year and 50,000-year events when dam-to-foundation joint shear strengths of only φ = 35° and c = 100 psi (0.69 MPa) (uncracked base) and φ = 35° and c = 0 (cracked base) were used in the model. As previously noted, these values were provided to the risk estimating team to help bound the estimated displacement relative to the shear strength assumptions. As can be seen, cracking and sliding of some monoliths occurs for the 5000-year earthquake events assuming the 1-year tensile and shear strength properties for the RCC. However, when the long-term strengths were used, no cracking or sliding of any monoliths occurred for the 5000-year and 10,000-year earthquakes; it occurred only for the 50,000-year earthquakes. Hence, the feasibility design plan and cross-section of the dam are robust and linear-elastic performance would be expected for events equal to or greater than the 10,000-year earthquakes as the materials in the dam strengthen over time.

LS-DYNA has an important shear strength limitation that makes the estimated displacement results conservative. Specifically, once the dam cracks through the contact surface, the strength can only be simulated with a friction angle and zero cohesion. The estimated shear strength of the contact should consider some component of cohesion or increased friction upon cracking but prior to any significant sliding, when roughness or asperities along the cracked surface make an important contribution to the shear strength. Degradation of the shear strength down to a value of 35 degrees occurs as deformations along the crack surface take place. Accounting for the process of strength degradation from an initial value of 55 degrees down to 35 degrees would more realistically represent actual displacements, which would be smaller than those provided in Tables A3 and A4, where strength degradation is not accounted for and the shear strength is immediately reduced to 35 degrees once the section cracks.
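The difference between the two treatments of post-cracking friction can be illustrated with a simple sketch. It compares an immediate drop to the residual angle (as described for the model) with a gradual degradation from 55 to 35 degrees as slip accumulates. The linear degradation law, the slip distance over which the residual value is reached, and the function names are assumptions made purely for illustration; the article does not specify a degradation relationship.

```python
import math


def friction_angle_deg(slip, peak_deg=55.0, residual_deg=35.0, slip_to_residual=1.0):
    """Friction angle after `slip` of accumulated sliding (assumed linear decay).

    slip_to_residual is a hypothetical slip distance (same units as slip)
    at which the residual angle is reached.
    """
    if slip_to_residual <= 0.0:
        raise ValueError("slip_to_residual must be positive")
    fraction = min(max(slip / slip_to_residual, 0.0), 1.0)
    return peak_deg - fraction * (peak_deg - residual_deg)


def shear_resistance(normal_stress, angle_deg):
    """Frictional shear resistance with zero cohesion."""
    return normal_stress * math.tan(math.radians(angle_deg))


# Resistance just after cracking: gradual degradation versus immediate drop
gradual = shear_resistance(100.0, friction_angle_deg(0.0))
immediate = shear_resistance(100.0, 35.0)
```

Under this assumed law the resistance just after cracking is about twice the residual value (tan 55° ≈ 1.43 versus tan 35° ≈ 0.70), which is consistent with the text's point that an immediate reduction to 35 degrees overestimates displacements.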

A comprehensive summary of the estimated base cracking, representative maximum tensile stress conditions in the monoliths, estimated maximum tensile stresses that occurred in the RCC adjacent to the monolith TCJs (vertical joints), and the maximum estimated differential movement between two monoliths is presented in Tables A5 and A6 in Appendix C for the base case and long-term strength assumptions and for the MYG008 Short Duration, and SRCH10 Long Duration (CSZ) ground motions, respectively. These tables include a set of columns in the center that indicate the monolith numbers where the maximum estimated tensile stress excursions occurred, the tensile stress excursion range, the number of excursions indicated by the model results, and the estimated damage conditions that would develop. The estimated damage condition is a qualitative description assigned to guide the risk assessment on the type (if any) of seepage or discharge response that might occur. Figure 8 generally illustrates how the values in these tables were developed for each of the study cases for the same short-duration 5000-year earthquake (MYG008). The location of the various nodes plotted in Figure 8 is along the upstream face of Monolith 15, as shown in Figure 9. The figure shows that, for this combination of earthquake and material properties, the maximum stress response (tensile stress of about 250 psi) occurs at about 12 s at node J along the upstream face of monolith 15, just below the chimney section.
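The excursion counts reported in the tables can be thought of as the number of separate intervals in which a nodal stress time history exceeds a tensile strength limit. A minimal sketch of such a count is given below. The definition of an excursion as a contiguous run of samples above the limit, the function name, and the sample values and strength are assumptions for illustration, not data from the study.

```python
def count_excursions(stress, strength):
    """Count contiguous runs of samples exceeding `strength`.

    Returns (number_of_excursions, peak_stress_during_excursions);
    the peak is None if the limit is never exceeded.
    Tension is taken as positive, matching the sign convention of Figure 8.
    """
    count = 0
    above = False
    peak = None
    for value in stress:
        if value > strength:
            if not above:
                count += 1  # a new excursion starts here
                above = True
            peak = value if peak is None else max(peak, value)
        else:
            above = False
    return count, peak


# Hypothetical principal stress samples (psi) and tensile strength limit (psi)
history = [0, 120, 260, 240, 90, -50, 230, 100, 255, 251, 10]
n_excursions, peak_stress = count_excursions(history, strength=225)
```

For the hypothetical history above, the three runs above the limit are counted as three excursions regardless of how many samples each run contains.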

**Figure 8.** Example stress response of various nodes along the upstream face of monolith 15 during the 5000-year Short Duration MYG008 Earthquake. Max\_principal\_stress shown on vertical axis is in psi (145 psi = 1 MPa). Note that positive stress on the vertical axis is tension, and negative stress on this axis is compression. Units of the horizontal scale are seconds.

The maximum responses (tensile stresses) included in Table A5 (Appendix C) occurred in Monoliths 24 and 25 and between Monoliths 22 and 23 (near crest node A) in the left abutment area, as illustrated in Figure 10. The inset in the lower right of Figure 10 shows a typical plot of principal stresses along the downstream face of the dam in the left abutment at the moment in the earthquake time history when the maximum stress response occurred.


**Figure 9.** Location of nodal points along the upstream face of monolith 15 corresponding to stress response plot in Figure 8. Different colored lines in the plot represent the boundaries of the monoliths formed by the TCJs in the dam model. Perspective is looking from the reservoir toward the upstream face of the dam.

**Figure 10.** Plot of maximum stress response in the vicinity of Crest Node A during the 5000-year, short-duration earthquake. The principal stresses in the stress plot in the lower right are shown on the vertical axis in psi (145 psi = 1 MPa). Positive numbers represent tension and negative numbers represent compression.

The estimated limits of tensile strength of the RCC parent materials, as well as bonded lift surfaces at one year, are shown in Figure 10. The estimates of the one-year and long-term RCC tensile strengths were similarly added to all stress plots at nodes selected for evaluation where maximum stress response occurred. This allowed for the identification of the number of tensile stress excursions that exceeded the estimated tensile strength of the RCC. The number of excursions was then used to assess, qualitatively, the cracking and concrete damage that would likely occur.
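The excursion count described above can be illustrated with a minimal sketch. It assumes a sampled maximum principal stress history in psi (tension positive) and treats consecutive samples above the strength limit as one excursion. The stress values and the two strength limits are hypothetical placeholders, not values from the structural analyses.

```python
def count_excursions(stress_psi, strength_psi):
    """Count distinct excursions of tensile stress above a strength limit.

    An excursion starts when the stress rises above the limit and ends when
    it falls back to or below it, so consecutive samples above the limit are
    counted once.
    """
    count = 0
    above = False
    for s in stress_psi:
        if s > strength_psi and not above:
            count += 1
            above = True
        elif s <= strength_psi:
            above = False
    return count


# Hypothetical stress history at one node (psi, tension positive).
history = [50, 180, 260, 240, 120, -80, 310, 90, 205, 330, 150]

# Placeholder strength limits (psi); the paper's estimates are not reproduced here.
one_year_strength = 200.0
long_term_strength = 300.0

n_one_year = count_excursions(history, one_year_strength)
n_long_term = count_excursions(history, long_term_strength)
print(n_one_year, n_long_term)  # prints "3 2"
```

A higher strength limit yields fewer excursions for the same stress history, which matches the way the long-term strengths reduce the indicated cracking potential.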

In addition to the issues of monolith cracking, displacement, and tensile stresses that could result in cracking and damage within the dam, an overall evaluation of the structural modeling results focused on the following considerations relative to the risk-informed design criteria:


#### *7.3. Cracking in the Upper Portion of the Dam*

Stresses indicated by the 2D model results suggest that cracking will develop, and sliding will first occur at the base of the dam near the foundation contact. For the best case estimated 1-year strength of the RCC, cracking and sliding along the base of the dam for some monoliths begins to occur for the 500-year return period earthquakes. However, assuming the long-term strength properties of the RCC, only the 50,000-year event for both the short and long-duration earthquakes causes cracking and sliding to occur along the base of the dam. For the longer-duration earthquakes, the model results also indicated a potential for cracking and some sliding on a continuous crack through the dam near the base of the chimney section for earthquakes greater than the 10,000-year event when base cracking and sliding did not occur in the model. For the 10,000-year earthquake and restraint at the base of the dam (no cracking and sliding), a limited number of stress excursions occurred in the maximum height monoliths that exceeded the estimated one-year tensile strength of the RCC. Based on the number of these excursions, as well as consideration of the long-term strength of the RCC, the excursions were judged not to be significant enough to cause cracking initiation.

Crest cracking and sliding near the chimney section of the dam were indicated for earthquake events with a 50,000-year return period with or without base sliding and regardless of the RCC material properties. This became the driving failure mode of concern for the risk analysis. The models simulating the potential for cracking and sliding at both the crest and base of the dam did not show meaningful overstressing at other zones within the cross-section. Stresses within the dam are significantly reduced when a basal crack develops at or near the dam and foundation contacts.

Based on an evaluation of the principal tensile stress orientations and magnitude from the 3D model results, as previously discussed, two possible crack surfaces were selected for simulation in the 2D model, as shown in Figure 4. Because of the slopes of both the upstream and downstream faces of the dam, the most likely orientation of the cracks at initiation would be perpendicular to the faces of the dam. Each initiating crack would slope in opposite directions toward the center of the dam. For the downstream face, the crack inclination was estimated to be dipping between 30 and 42 degrees from horizontal toward the reservoir. For the upstream face, the crack inclination was estimated to be dipping about 15 to 22 degrees from the horizontal toward the downstream toe of the dam.

The two contact surfaces were evaluated in different ways to confirm the potential for cracking and potential deformations along a continuous crack. Two of three combinations were judged as likely based on the RCC lift surface construction method, the orientation of principal stresses, and information on cracking from the Koyna dam case history and both physical and FEM model studies of the Koyna dam cracking incident:


As previously noted, one study case was completed of a crack along the continuous inclined Contact 2 surface to inform the risk analysis. The likelihood of this continuous cracked surface developing was judged to be very remote, but the case was presented to the risk team to provide a severe upper bound of potential crest block deformations. The structural analysis team judged it to be very remote because principal tensile stress orientations rotate through the structure to an orientation that parallels the upstream face of the dam, and there is not enough stress in the needed orientations to cause such a crack to develop through the entire cross-section of the dam.

Initially, the analyses with the 2D model showed significantly higher deformations than the 3D model due to the lack of monolith side forces in the 2D model. To increase confidence in the 2D model results, as previously noted, radial stresses consistent with vertical monolith joint stresses mobilized in the 3D model were added to the 2D model. Once these stresses were considered, the deformations estimated by the 2D model became similar to those estimated by the 3D model. This is illustrated by the results presented in Table 9 for the 2D Study Case 3 analyses. The first row shows the estimated base and crest displacements when no radial (TCJ) side stress was used. To obtain the relative crest displacement, the base displacement is subtracted from the crest displacement, as shown in the fourth column of the table.


**Table 9.** 2D Study Case 3, estimated permanent displacements for the MYG008-50k Earthquake using base case (one-year) strengths.

The second row in the table shows the estimated displacements when a conservative amount of side shear stress was added to the 2D model. The third row summarizes the estimated displacements of the entire monoliths in the left abutment, center valley section, and right abutment areas of the 3D model. The 2D model of Monolith 15 represents a monolith in the center valley area of the 3D model. When the side shear was added, the estimated base displacement of 12 inches (30.48 cm) from the 2D model compares with an estimated displacement of 6 inches (15.24 cm) in the center monolith from the 3D model. Further adjustment of the side shear stresses was not performed because the 2D model estimates with a limited amount of side shear stress (less than actually predicted by the 3D model) were believed to be reasonable and slightly conservative when evaluating the issue of cracking and displacements that may occur in the upper portion of the dam for this loading condition.

For comparison purposes, the results of Study Case 5 with the MYG008-50k earthquake and long-term strengths are shown in Table 10. As can be seen by comparing the results in this table with Table 9, increases in shear strength with time do not change the potential for the structure to crack through the dam section at either the base or crest but significantly reduce the displacements that would occur.

**Table 10.** 2D Study Case 5, estimated permanent displacements for the MYG008-50k Earthquake using long-term strengths.


The results of analyses of the 2D model for the long-duration SRCH10-50k earthquake with long-term strengths are summarized in Table 11. These results showed that the base sliding of the adjusted 2D model with side shear stress produced displacements that were close to the calculated sliding from the 3D model. Similar to the comparison of the long-term strength, short-duration earthquake results above, cracking through the crest and crest sliding of about half of the estimated base sliding is possible. The importance of including side friction in the 2D model to account for arch action once cracking develops is clearly indicated by the results in Table 11.


**Table 11.** Permanent displacements for the SRCH10-50k Earthquake, long-term strengths.

Overall, the 2D model results for simulations with side shear stress generally confirmed the potential for cracking in the upper portion of the dam for the 50,000-year earthquake events. Cracking through the upper portion of the dam may also occur for the 10,000-year earthquake events but only when the base case (1-year) strengths were used. Generally, crest displacements were predicted to be about one-half of base sliding displacements when the crest section cracks through the entire section. These estimates of behavior were believed to be conservative. The 3D model results indicated that the radial shear stresses along the TCJ surfaces near the dam crest are larger than the average TCJ radial shear stress used in the 2D analyses.

The 2D modeling also confirmed that, with the reservoir full, the primary direction of sliding of the dam crest is downstream when cracking through the section occurs either along a lift surface or along the bi-linear surface described above. For the base case (1-year) strengths and the long-duration earthquake (SRCH10-50k), crest deformations ranged from less than a foot for the bi-linear crack configuration to about 3.5 feet (1.07 m) for a continuous crack along a horizontal lift surface. Estimated crest deformations ranged from a few inches for the bi-linear crack configuration to just over 1 foot for a continuous crack along a lift surface when long-term strengths and the long-duration (SRCH10-50k) earthquake were evaluated.

When the model was allowed to crack along both the Contact 3 and Contact 2 surfaces without any restraint, a bounding result (worst case, with a very low likelihood of occurrence) showed that the small wedge of RCC material between the Contact 2 and 3 surfaces (Part 2 in Figure 4b) in the upstream portion of the dam developed and moved upstream between 2 and 6 feet, while the remainder of the upper portion of the dam (Parts 3 and 4 in Figure 4b) moved downstream along the horizontal lift surface. Because the section of the dam is 47 feet wide at the crack locations, net deformations between the upper portion of the dam (Parts 3 and 4) and the Part 2 wedge in excess of 20 to 23 feet (6.1 to 7 m) would have to occur for any sort of upstream toppling failure to develop. The analysis further indicated that the crest of the dam (Parts 3 and 4) was stable following even the largest, long-duration 50,000-year earthquake event. These results were considered conservative because the 2D model did not consider the increasing side restraint that would occur as downstream deformations engage increasing arch resistance to sliding.

#### **8. Risk Analysis Results**

The original appraisal-level design team risk analysis for the new RCC dam configuration was conducted in 2016 and considered only seismic-related PFMs for a straight-axis gravity dam cross-section with flatter upstream and downstream slopes. The results of that risk analysis were below Reclamation's risk tolerance guidelines. However, there were a number of ideas for optimization of the dam section that could further improve the safety of the dam and increase confidence in the risk estimate results. One important idea from the appraisal-level risk analysis was to curve the gravity dam axis in order to mobilize arch action that would reduce estimated sliding displacements. It was further believed that curving the dam axis would allow for some reduction in the upstream and downstream slope requirements resulting in possible cost reductions for the construction of the dam.

The feasibility design configuration (see Appendix B) included a curved gravity plan configuration for the dam and reduced upstream and downstream slopes. For the feasibility-level design risk analysis completed in late 2019, a total of 18 PFMs, including seismic (structural and geologic hazard), static, and hydrologic, were described, discussed, and evaluated. Full quantitative risk analyses following the best practices for dam and levee safety risk analyses [52] were performed. Risks were estimated by a joint HDR-Reclamation team for three primary risk-contributing PFMs. Details of the risk analyses, including failure mode event trees and nodal analyses, are beyond the scope of this paper. However, the overall results are instructive relative to the conceptual framework for risk-informed designs presented in this paper. The total Annual Failure Probability (AFP) was nearly an order of magnitude below Reclamation's Public Protection Guideline value for AFP [53]. The total Annualized Life Loss (ALL) risk (1.4 × 10<sup>−4</sup>) is also nearly one order of magnitude below the guideline value for ALL. The feasibility design level risk estimates are summarized in Table 12.


**Table 12.** Summary of risk analysis results for the new Scoggins RCC Dam.

The risk results were considered conservative. The confidence was somewhat low, and uncertainties were moderate to high regarding the performance of the upper portion of the dam for the 10,000-year and 50,000-year events under PFM6. Likewise, because of the way the ground motions were developed for the short-duration and long-duration events, summing the risk from both durations for a single PFM may have overestimated the total risk. Similarly, the consequence estimated for crest failure modes would likely be far less than breach discharge resulting from the breach of a full monolith which was not accounted for in the Total ALL estimate. PFM 6 contributed 64% of the total mean AFP and 92% of the total mean ALL.
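The margin between the quoted ALL estimate and the guideline can be checked with a short calculation. The sketch below uses the total ALL of 1.4 × 10<sup>−4</sup> and the 92% PFM 6 share stated in this section. The guideline value of 1 × 10<sup>−3</sup> is an assumption made for illustration and should be confirmed against [53].

```python
import math


def orders_below(estimate, guideline):
    """Return how many orders of magnitude the estimate lies below the guideline."""
    return math.log10(guideline / estimate)


ALL_TOTAL = 1.4e-4         # total mean Annualized Life Loss quoted in the text
ALL_GUIDELINE = 1.0e-3     # ASSUMED guideline value; confirm against [53]
PFM6_SHARE_OF_ALL = 0.92   # PFM 6 share of total mean ALL quoted in the text

margin = orders_below(ALL_TOTAL, ALL_GUIDELINE)
pfm6_all = PFM6_SHARE_OF_ALL * ALL_TOTAL

print(f"ALL margin: {margin:.2f} orders of magnitude below the assumed guideline")
print(f"PFM 6 contribution to ALL: {pfm6_all:.3e}")
```

Under that assumed guideline, the margin is a little less than one order of magnitude, consistent with the "nearly one order of magnitude" wording, and almost all of the life-loss risk comes from PFM 6.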

Verifying the risk of the proposed layout and cross-section of the dam was an important step in the feasibility design. The quantitative risk analysis provided not only confirmation of the safety of the new RCC dam configuration but important input that further optimization of the configuration may be possible during the final design.

#### **9. Conclusions and Recommendations**

The 3D and 2D structural models developed for the feasibility design configuration of the new Scoggins RCC dam demonstrated that the section was robust and would provide a level of safety that would be acceptable under Reclamation's Public Protection Guidelines [52]. Further, the results of the risk analysis for the feasibility design configuration validated the four-component risk-informed design criteria that were used in developing the feasibility design.

Specifically, the following was found related to the risk-informed design criteria established for the feasibility design of the dam:

#### *9.1. #1 Elastic (Linear) Response for 500- to 1000-Year Seismic Events*


#### Structural Analysis Results

The response of the structure is related to the strength properties of the RCC. The RCC properties will increase with time, and two study cases were evaluated, including (1) base material properties (1-year target strengths), and (2) long-term properties (estimated 10-year strengths). For the base case material properties, linear elastic behavior is expected for earthquakes with recurrence intervals between 1000 and 5000 years. For the long-term material properties, linear elastic behavior is expected for earthquakes with recurrence intervals between 5000 and 10,000 years. No cracking of the concrete or sliding of the dam is expected in the linear elastic performance range for the dam.

#### *9.2. #2 Linear-Elastic Transitioning to Possible Localized Non-Linear Response with Limited Damage Beginning to Occur between the 1000- and 5000-Year Seismic Events*


#### Structural Analysis Results

The structural analyses confirmed that the transition into localized non-linear response with limited damage is a function of the RCC material strength. For the base material properties, this transition is expected between the 1000- and 5000-year earthquake events. By the time the structure is 10 years old, the transition is expected to occur for earthquakes ranging from 5000- to 10,000-year events. No sliding of the dam at any location occurs for a short or long duration 1000-year event for any of the assumed material properties.

#### *9.3. #3 Non-Linear Response, Moderate Damage, and Post-Earthquake Stability for Events Larger Than 5000-Year Return Periods*


#### Structural Analysis Results

The non-linear response was indicated beginning at an earthquake event with a recurrence interval as low as the 1000- to 5000-year range when base case (1-year) strength parameters are used. The non-linear response was not indicated in the analyses for the 10,000-year earthquake loading when the long-term strength parameters, corresponding to the expected RCC strength at about 10 years and beyond, were used. Cracking and sliding displacements will preferentially develop along the base of the dam, with the potential for cracking and sliding at a location near the chimney section for an earthquake recurrence interval between 10,000 and 50,000 years. The occurrence and magnitude of the estimated cracking and displacements are directly related to both the strength properties of the RCC and the duration of earthquake loading. Estimated displacement for the larger events where non-linear behavior is initiated is generally less than the specified design criteria.

Overall, the dam is expected to perform well for the full range of earthquakes considered in the structural analyses.

#### *9.4. #4 Post-Seismic Stability FOS > 1.0*

• Predicted for all loading conditions, including the 10,000- and 50,000-year events, when a reasonable lower bound residual friction angle of 35 degrees is assumed for the planes of sliding, and full uplift is applied linearly along the sliding plane as a full reservoir at the upstream heel of the dam and tailwater at the toe of the dam.

#### Post-Seismic Stability Analysis Results

A post-earthquake gravity analysis with the 2D model considering a small component of side shear stress, a residual friction angle of 35 degrees along a basal sliding plane, and uniform reservoir pressure along the entire length of the cracked base yielded a minimum FOS of 1.2. For all study cases, the design configuration results in a dam that will remain stable following all earthquake events considered in the analyses. Maximum displacements of 2 to less than 4 feet (0.61 to <1.22 m) were estimated for the maximum height sections with the 3D model for the 50k, long-duration earthquake and base (1-year) material properties. Further analysis of the dam crest cracking potential with the 2D model estimated maximum potential displacements of 2 to 4 feet in the downstream direction for the most likely crack configuration. For a worst-case (very low probability) continuous crack sloping in the upstream direction, model simulations indicate a maximum upstream displacement potential for a small wedge of RCC material of 2 to 6 feet for the 50k, long-duration earthquake. The likelihood of a toppling or overtopping event resulting from the cracking and sliding of the dam near the crest is considered remote.
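The effect of the uplift assumption on a post-seismic sliding check can be sketched with a simple limit-equilibrium calculation. This is not the 2D finite element analysis used in the study. It is a unit-width friction-only check on a horizontal plane, and the weight, base length, and water levels are hypothetical; only the 35-degree residual friction angle comes from the text.

```python
import math

GAMMA_W = 62.4  # unit weight of water, lb/ft^3


def sliding_fos(weight, base_length, head_up, head_down, phi_deg,
                side_shear=0.0, uniform_uplift=False):
    """Friction-only sliding factor of safety for a 1 ft wide slice.

    weight      : slice weight, lb per ft of dam length
    base_length : length of the sliding plane, ft
    head_up     : reservoir depth at the heel, ft
    head_down   : tailwater depth at the toe, ft
    side_shear  : resisting side shear force, lb per ft (assumed input)
    """
    if uniform_uplift:
        # Full reservoir pressure along the entire cracked base.
        uplift = GAMMA_W * head_up * base_length
    else:
        # Linear variation from reservoir head at the heel to tailwater at the toe.
        uplift = GAMMA_W * 0.5 * (head_up + head_down) * base_length
    driving = 0.5 * GAMMA_W * (head_up ** 2 - head_down ** 2)
    resisting = (weight - uplift) * math.tan(math.radians(phi_deg)) + side_shear
    return resisting / driving


# Hypothetical slice: values chosen for illustration only.
params = dict(weight=1_200_000.0, base_length=90.0,
              head_up=100.0, head_down=10.0, phi_deg=35.0)

fos_linear = sliding_fos(**params)
fos_uniform = sliding_fos(**params, uniform_uplift=True)
print(f"linear uplift FOS:  {fos_linear:.2f}")
print(f"uniform uplift FOS: {fos_uniform:.2f}")
```

Applying full reservoir pressure along the whole cracked base gives the lower factor of safety of the two cases, which is why it serves as the more demanding post-earthquake check.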

**Funding:** The research summarized in this paper was funded by Clean Water Services as part of engineering studies to develop the RCC dam configuration presented in this paper. No other external funding was provided.

#### **Data Availability Statement:** Not applicable.

**Acknowledgments:** The following people made important contributions to the feasibility design of the Option 3 RCC dam, summarized in this paper.


The author would also like to express appreciation for Amin Hariri-Ardebili for his input and review of the section titled "A Review of Seismic Design and Dam Safety Risk Analysis of Concrete Dams." Portions of this paper were first published for the 2022 Annual Conference of the United States Society on Dams [54] titled "Risk Informed Design of a New Scoggins RCC Dam, Oregon Under Extreme Seismic Loading".

**Conflicts of Interest:** The author declares no conflict of interest.

#### **Appendix A**

**Figure A1.** Example SRCH10-SP seeded and spectrum-matched 10k-year ground motions, and Husid Plot considered during Feasibility Design Structural Analyses. Vertical lines on the Husid Plot represent the time required to build from 5% to 95% of the Arias intensity [44]. The H1 component was considered representative of upstream-downstream loading, and H2 representative of cross-canyon loading.
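The 5% to 95% build-up time marked on a Husid plot can be computed from an acceleration history as follows. This is a minimal sketch using the standard Arias intensity definition, I<sub>a</sub> = π/(2g) ∫ a(t)² dt, with trapezoidal integration. The record below is a synthetic sine burst, not one of the project ground motions.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def husid_curve(accel, dt):
    """Return (Arias intensity, normalized cumulative curve) for accel in m/s^2."""
    cumulative = [0.0]
    for a0, a1 in zip(accel, accel[1:]):
        # Trapezoidal integration of a(t)^2 over one time step.
        cumulative.append(cumulative[-1] + 0.5 * (a0 ** 2 + a1 ** 2) * dt)
    arias = math.pi / (2.0 * G) * cumulative[-1]
    husid = [c / cumulative[-1] for c in cumulative]
    return arias, husid


def significant_duration(husid, dt, lower=0.05, upper=0.95):
    """Time between the first samples reaching the lower and upper fractions."""
    t_lower = next(i for i, h in enumerate(husid) if h >= lower) * dt
    t_upper = next(i for i, h in enumerate(husid) if h >= upper) * dt
    return t_upper - t_lower


# Synthetic 20 s record: a 2 Hz sine under a smooth envelope (illustrative only).
dt = 0.01
accel = [math.sin(math.pi * i * dt / 20.0) ** 2 * math.sin(2.0 * math.pi * 2.0 * i * dt)
         for i in range(2001)]

arias, husid = husid_curve(accel, dt)
d5_95 = significant_duration(husid, dt)
print(f"Arias intensity: {arias:.3f} m/s, D5-95: {d5_95:.2f} s")
```

The two times found in `significant_duration` correspond to the vertical lines on the Husid plot, and their difference is the significant duration used to distinguish short- from long-duration records.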

#### **Appendix B**


**Figure A2.** Scoggins RCC Dam: (**a**) Plan layout; (**b**) Profile (looking downstream). Stationing and elevation scales are in feet. References to notes in these figures relate to notes provided on the feasibility design drawings, which have not been included here.


**Figure A3.** Representative maximum (non-overflow) cross-section of the Scoggins RCC Dam.
