Predictive Archaeological Risk Assessment at Reservoirs with Multitemporal LiDAR and Machine Learning (XGBoost): The Case of Valdecañas Reservoir (Spain)

Cerrillo-Cuenca, Enrique; Bueno-Ramírez, Primitiva

doi:10.3390/rs17071306


by Enrique Cerrillo-Cuenca ¹,* and Primitiva Bueno-Ramírez ²

¹ Department of Prehistory, Archaeology and Ancient History, Complutense University of Madrid, 28040 Madrid, Spain

² Prehistory Section, Department of History and Philosophy, University of Alcalá, 28805 Alcalá de Henares, Spain

* Author to whom correspondence should be addressed.

Remote Sens. 2025, 17(7), 1306; https://doi.org/10.3390/rs17071306

Submission received: 18 February 2025 / Revised: 31 March 2025 / Accepted: 3 April 2025 / Published: 5 April 2025

(This article belongs to the Special Issue Multi-Data Integration in Near-Surface Geophysics and Close Range Remote Sensing Applied to Cultural Heritage)


Abstract:

The conservation and monitoring of archaeological sites submerged in water reservoirs have become increasingly necessary in a climatic context where water management policies are possibly accelerating erosion and sedimentation processes. This study assesses the potential of using multitemporal LiDAR data and Machine Learning (ML)—specifically the XGBoost algorithm—to predict erosional and sedimentary processes affecting archaeological sites in the Valdecañas Reservoir (Spain). Using data from 2010 to 2023, topographic variations were calculated through a robust workflow that included the co-registration of LiDAR point clouds and the generation of high-resolution DEMs. Hydrological variables, topographic descriptors, and water dynamics-related factors were extracted and used to train models based on the detected measurement errors and the temporal ranges of the DEMs. The model trained with 2018–2023 data exhibited the highest predictive performance (R² = 0.685), suggesting that sedimentary and erosional patterns are partially predictable. Finally, a multicriteria approach was applied using a DEM generated from 1957 aerial photographs to estimate past variations based on historical terrain conditions. The results indicate that areas exposed to fluctuating water levels and different topographic orientations suffer greater damage. This study highlights the value of LiDAR and ML in assessing the vulnerability of archaeological sites in highly dynamic environments.

Keywords:

LiDAR; archaeological heritage; climate change; machine learning; predictive modelling; risk assessment

1. Introduction

The current conditions of global climate change [1,2,3,4] pose a risk to the preservation of cultural heritage assets, as many recent studies have highlighted [5,6,7,8,9,10]. The impacts of climate change on archaeological assets can be considered direct when transformations in land cover (such as changes in the coastline [11,12,13,14,15,16], vegetation [17], or permafrost [18,19,20,21,22]) alter the deposits where the archaeological sites are located. Indirect consequences involve modifications stemming from shifts in the intensity of natural resource management—a phenomenon further compounded by policy interventions aimed at addressing the outcomes of climate variability [23]. As an example of the latter, droughts and the increasing aridification of certain regions, such as southern Europe [24,25], drive notable changes in how water resources are managed [26,27,28,29,30].

The summer of 2021 was a clear example of this trend, as evidenced by the large number of media reports, many of which focused on the preservation of archaeological sites. This abundance of news reflects the growing public interest in protecting archaeological heritage in the face of an exceptional transformative situation. However, it is important to note that other factors also contribute to the vulnerability of archaeological sites, such as agricultural intensification, looting, and inadequate management practices. This underscores the need for authorities to adopt integrated strategies for monitoring and mitigating a wide range of threats to cultural heritage.

1.1. Objectives

Global warming and water management policies [31,32,33,34,35,36] have intensified the frequency and magnitude of terrain changes in artificial water reservoirs, altering the dynamics of sedimentation and erosion. Another emergent trend observed in reservoirs is the sharp descent of water levels to unusually low levels. These variations are exposing archaeological sites to risks such as erosion and looting [37,38,39]. In fact, UNESCO considers erosion a threat to archaeological sites in the context of current climate change [40]. These threats demand the design of tools for identifying and reducing these effects on archaeological assets. Remote sensing and predictive modelling offer a framework to address these needs over large areas, such as reservoirs [41,42,43,44].

In this work, we focus specifically on water resource management [33,34,35], although the remote sensing and data analysis procedures discussed can be applied to other types of situations and scenarios. In this study, we present a comprehensive assessment strategy for evaluating the impacts on the archaeological heritage of the area, building on our previous research and drawing on similar experiences in reservoir management and archaeological site studies [38].

The questions that remain to be answered are how erosional and sedimentary processes affect the preservation of archaeological sites and how monitoring and modelling strategies for topographic changes can be developed.

Specifically, water resource management practices in lakes and artificial reservoirs often lead to significant changes in soil properties. In simple terms, when managing these water bodies—for example, by varying water levels—the natural structure of the soil can be disturbed, which often results in increased erosion [45,46]. Droughts expose archaeological sites that have been submerged under artificial lakes over the past 100 years to the open air in unprecedented ways, accelerating their continuous degradation [37,39,41,42,43,44,47]. The apparently erratic fluctuations in water levels lead to more pronounced erosive and sedimentary dynamics [46], which result in poorer preservation conditions for the archaeological sites [37].

The diagnosis and protection work of such cultural heritage sites is challenging for several reasons. The most decisive factor is the inability to determine the number of archaeological sites affected by reservoirs—either because the surveys conducted before the flooding were not exhaustive, or because many archaeological features remain undetected without the use of appropriate subsurface exploration techniques (e.g., geophysical prospecting [44]). Another challenging factor is the looting of exposed archaeological assets, which, in most cases, lack adequate protection [37]. These events are unpredictable because reservoir level fluctuations can suddenly expose archaeological sites, creating opportunities for looters, and the absence of continuous surveillance further complicates their timely detection. Consequently, monitoring reservoir basins becomes the most suitable tool for decision-making for the authorities involved in heritage preservation. One valuable monitoring method is remote sensing [41,42,43,44], which enables the systematic and precise observation of the changing conditions affecting archaeological sites across large areas.

1.2. The Case of the Valdecañas Reservoir (Cáceres, Spain)

In 2019, the Dolmen of Guadalperal (Figure 1), a Neolithic collective sepulchre, emerged from the Valdecañas Reservoir (Spain) (Figure 2), an event covered by many global outlets between 2019 and 2022 [48,49,50,51,52]. This archaeological site was excavated by the Spanish–German archaeologist Hugo Obermaier between 1925 and 1927 [53], before the Valdecañas Reservoir submerged it again in 1963 [54]. From that time until 2019, the monument emerged only twice due to exceptional circumstances, one being a severe drought in 1992 [54,55]. However, since 2019, this phenomenon has occurred four times, which is an unusually high frequency [44].

Given this unusual situation, the Spanish Ministry of Culture commissioned us to study the monument and to implement strategies for protecting the site and monitoring risks. The Valdecañas Reservoir, built for hydroelectric generation, was constructed on the Tagus River—the longest and highest discharge river on the Iberian Peninsula. One of the challenges for preservation is the vast area covered by the reservoir, spanning approximately 7300 hectares, with a total capacity of 1446 cubic hectometres [56]. The reservoir’s maximum level is set at 315 m above sea level [56], but during drought periods, it has dropped to as low as 294 m above sea level. This fluctuation in water levels exposes large tracts of land, uncovering numerous archaeological sites located in areas selected for their fertile soils and closeness to the river.

Initial estimates indicated a total of 82 archaeological sites distributed across various parts of the reservoir [44,57], particularly along the shoreline areas near the reservoir’s edge, as the deeper parts of the reservoir have remained consistently submerged. Undoubtedly, this initial catalogue represents only a small fraction of the area’s actual archaeological potential.

The Valdecañas Reservoir is an ideal setting for this study because it exhibits a recurrent cycle of exposure and erosion of archaeological evidence. Its analysis allows us to test the effectiveness of remote sensing tools that were not available even a decade ago, tools which are useful for assessing risks and designing preservation strategies for these archaeological sites.

2. Materials and Methods

Modelling the erosive and sedimentary behaviour of a reservoir is a complex objective that extends beyond the initial scope of this study. Variables such as the intensity of agricultural land use or precipitation rates directly influence the sedimentary behaviour of reservoirs—aspects that fall outside our expertise as archaeologists. Rather than attempting to model the detailed processes of erosion or sediment deposition, our goal is to assess the extent to which erosive and sedimentary processes affect the preservation of archaeological sites [58,59]. In other words, we aim to model the outcomes of these processes rather than the complex processes themselves. This approach allows us to identify areas with greater dynamism, which, in principle, have a direct impact on cultural heritage preservation. The workflow for our methodology is summarised in Figure 3.

2.1. Description of the Datasets

For this study, we rely on the following initial data, which we briefly describe and relate to the analysis of the problem:

Archaeological site database: A comprehensive geodatabase of archaeological sites documented through surface surveys and bibliographic sources. These sites may present the following initial conditions: being in areas of permanent flooding, having completely disappeared, or remaining preserved but situated in areas that are exposed or submerged at varying intervals. As is common in archaeology, the data exhibit certain inconsistencies and inaccuracies when defining the extent of the sites. We opted to calculate the centroids of the available polygons and create a 100 m buffer zone around them, providing a standardised area to monitor and estimate erosive and sedimentary processes in a homogeneous manner. A total of 82 sites were considered, spanning from the Lower Palaeolithic to the Middle Ages (Figure 2). The Supplementary Material Data S1 lists the 40 study sites (see Section 3.2 for selection criteria). Their names have been anonymised with official IDs and their coordinates withheld due to sensitivity;
Reservoir water level records: Daily historical records of the reservoir’s volume from 1970 to 2021, provided by the Tagus River Basin Authority, were used to infer the reservoir’s water level. These records have been employed to identify areas of the terrain most exposed to waterline impact and to create a shoreline recurrence map, as discussed in Section 2.4.1. A preliminary analysis that shapes the focus of this work reveals a significant trend: from 1970 to 2000, decade-long moving averages show smooth patterns of water level increases and decreases with some seasonality. However, starting in 2005—and more prominently from 2010 onward—peak water levels tend to occur in spring (around April–May), followed by a sharp drop to a minimum in August (Figure 4 and Figure 5). This trend does not appear to be explained solely by drought conditions, but rather by a convergence of water-management policy factors [36];
Historical DEM (1957): A digital elevation model (DEM) generated from the photogrammetric restitution of a series of aerial photogrammes from a USAF flight conducted in 1957 [60,61]. This DEM serves primarily illustrative purposes and, at best, provides low-quality estimates, as significant errors have been identified when compared to higher-precision data [54]. The DEM was generated using Agisoft Metashape Pro. Depth maps were converted into point clouds, which were cleaned of outliers and then interpolated into a DEM using a kriging algorithm. Its spatial resolution was forced to 5 m/pixel (Figure 6). The inability to obtain ground control points in areas now flooded is the main factor limiting the accuracy of this model;
LiDAR data: To compare the point clouds over time, we used LiDAR datasets from the Spanish National Geographic Institute, which regularly provides LiDAR coverage of the national territory with varying resolutions and accuracies. For the study area, three coverages are currently available, captured in 2010, 2018, and 2023. The files from 2010 and 2018 cover an area of 2000 × 2000 m, while the 2023 coverage spans 1000 × 1000 m. The 2010 and 2018 coverages are final, fully processed products, whereas the 2023 coverage was automatically classified, which may have introduced point misclassification. The technical specifications and flight intervals can be found in Table 1. It was not possible to obtain more precise data on the flight times, even by extracting timestamps from the metadata of the LAZ files.
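The standardisation of site extents described for the archaeological site database (polygon centroid plus a 100 m buffer) can be illustrated with a minimal sketch. The function names, the planar coordinates in metres, and the example polygon are assumptions made for illustration; they are not taken from the study's geodatabase.

```python
import math

def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon")
    return cx / (3.0 * area2), cy / (3.0 * area2)

def in_buffer(point, centre, radius_m=100.0):
    """True if the point lies within radius_m of the centre (planar distance)."""
    return math.dist(point, centre) <= radius_m

# Hypothetical site polygon in projected coordinates (metres).
site_polygon = [(0.0, 0.0), (60.0, 0.0), (60.0, 40.0), (0.0, 40.0)]
centre = polygon_centroid(site_polygon)
inside = in_buffer((100.0, 20.0), centre)   # 70 m from the centroid
outside = in_buffer((131.0, 20.0), centre)  # 101 m from the centroid
```

In practice, a GIS library would normally handle both steps; the sketch only shows why a fixed-radius buffer yields comparable monitoring areas regardless of how irregular the original polygons are.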

2.2. Cleaning and Co-Registration Processes for LiDAR Point Clouds

The LiDAR files were downloaded as compressed point clouds in the LAZ format and processed using the Python 3.12 programming language, with specific libraries such as laspy, Open3D [62], and Whitebox tools [63]. To retrieve bare-earth information, the first step involved filtering all points classified as ground in the LAZ files, following the classification standards of the ASPRS. Points classified as vegetation, buildings, and erroneous points were removed. Subsequently, point co-registration across the different coverages was performed. Co-registration minimises the differences between point clouds, enabling the optimal alignment of point geometry [64] and resulting in more accurate comparisons between datasets. To perform co-registration, one needs a reference point cloud and another point cloud of an identical extent to transform. We selected the 2018 point cloud as the reference (Figure 7) because it was the best-classified coverage, had a higher resolution than the 2010 coverage, and was chronologically intermediate between the 2010 and 2023 coverages.

To perform the co-registration, a voxel sampling technique was applied to improve the alignment of the 3D point clouds [65,66,67], thereby enhancing computational efficiency and alignment accuracy. Given the differences in point densities, a voxel size of 0.5 m was chosen for the 2010 coverage and 0.25 m for the 2023 coverage. This approach reduces data complexity by representing dense point clusters with single, representative points, while preserving the geometric features needed for precise alignment. This procedure is also useful when comparing point clouds with different resolutions.
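As an illustration of the voxel sampling step, the following sketch replaces all points that fall within the same cubic voxel by their centroid. It is a simplified, pure-Python stand-in for the down-sampling routines offered by point cloud libraries, not the implementation used in the study; the function name and the sample coordinates are hypothetical.

```python
from collections import defaultdict

def voxel_downsample(points, voxel_size):
    """Replace all points that share a cubic voxel by their centroid."""
    buckets = defaultdict(list)
    for x, y, z in points:
        # Integer voxel index along each axis.
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        buckets[key].append((x, y, z))
    reduced = []
    for pts in buckets.values():
        n = len(pts)
        reduced.append(tuple(sum(axis) / n for axis in zip(*pts)))
    return reduced

# Hypothetical ground points (x, y, z) in metres.
cloud = [(0.1, 0.1, 300.1), (0.2, 0.3, 300.2), (0.9, 0.1, 300.1), (0.1, 0.1, 300.9)]
coarse = voxel_downsample(cloud, 0.5)   # voxel size chosen for the 2010 coverage
fine = voxel_downsample(cloud, 0.25)    # voxel size chosen for the 2023 coverage
```

With the larger voxel, the first two points merge into one representative point, whereas the smaller voxel keeps all four, which is why the voxel size has to be matched to the point density of each coverage.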

Next, the optimal geometric transformation (consisting of translation and rotation) was computed using a least squares method to align each non-reference point cloud to the 2018 reference. The point-to-point ICP algorithm was used for this purpose [64,68,69,70]. This algorithm minimises the distance between points in both clouds using a k-d tree to optimise the search process.

For this purpose, we used the ICP implementation from the Open3D library [62]. For the co-registration of the point clouds [71,72], we set an alignment tolerance threshold of 0.5 m and a maximum of 400 iterations. Finally, we calculated the alignment fitness coefficient and the RMSE of the inliers (the valid paired points).

The result is a matrix (T):

T = \begin{bmatrix} R & t \\ 0 & 1 \end{bmatrix}

where R denotes rotation and t denotes translation, which are applied to the point cloud to be transformed.
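To make the role of T concrete, the sketch below builds a 4 × 4 homogeneous matrix and applies it to a single point. For simplicity it assumes a rotation about the vertical axis only, and the angle and translation values are invented for illustration; the matrices estimated by ICP in the study are general rigid transformations.

```python
import math

def make_transform(yaw_deg, translation):
    """4x4 homogeneous matrix: rotation about Z by yaw_deg, then translation."""
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    tx, ty, tz = translation
    return [
        [c, -s, 0.0, tx],
        [s, c, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]

def apply_transform(T, point):
    """Apply T to a 3D point expressed in homogeneous coordinates."""
    h = (point[0], point[1], point[2], 1.0)
    return tuple(sum(T[r][k] * h[k] for k in range(4)) for r in range(3))

# Hypothetical values: 90 degree rotation, 10 m shift in x, 0.2 m shift in z.
T = make_transform(90.0, (10.0, 0.0, 0.2))
moved = apply_transform(T, (1.0, 0.0, 300.0))
```

The upper-left 3 × 3 block plays the role of R and the last column that of t, so applying T is equivalent to computing R p + t for every point p of the cloud being aligned.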

2.3. Interpolation and Generation of DEMs from LiDAR Point Clouds

Once the point clouds were co-registered and filtered, they were interpolated using the IDW algorithm [73] implemented in the Whitebox library, employing a resolution of 1 m, which is the highest feasible resolution considering the point density provided by the 2010 coverage. After interpolating the point clouds into rasters, all rasters were merged to form a single DEM per coverage year. We then excluded areas corresponding to the reservoir flood zones from each DEM, as they contained anomalous data. Specifically, areas up to an elevation of 301 m a.s.l. were masked out, since 301 m was the minimum common elevation among all three LiDAR coverages.
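The interpolation and masking steps can be sketched as follows. This is a minimal inverse distance weighting example, assuming a weighting power of 2 and no search radius; the Whitebox implementation exposes its own parameters, which are not reproduced here. The function names and sample points are hypothetical.

```python
import math

def idw(samples, x, y, power=2.0):
    """Inverse-distance-weighted elevation at (x, y) from (x, y, z) samples."""
    num = den = 0.0
    for sx, sy, sz in samples:
        d = math.hypot(sx - x, sy - y)
        if d == 0.0:
            return sz  # the node coincides with a sample point
        w = 1.0 / d ** power
        num += w * sz
        den += w
    return num / den

def mask_flooded(z, min_common=301.0):
    """Mask cells at or below the minimum common elevation (301 m a.s.l.)."""
    return None if z <= min_common else z

# Hypothetical ground returns and a row of grid nodes spaced 1 m apart.
samples = [(0.0, 0.0, 300.0), (2.0, 0.0, 304.0)]
row = [mask_flooded(idw(samples, x, 0.0)) for x in (0.0, 1.0, 2.0)]
```

The first node falls below the 301 m limit and is masked, while the remaining nodes keep their interpolated elevations.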

It is essential to validate the accuracy of the models before implementing a more detailed analysis. To achieve this, we selected a random sample of 1000 points above the reservoir’s maximum elevation (315 m a.s.l.), ensuring that the sampled terrain did not have significant real topographic differences across the datasets. For validation purposes, the height values of each sample point were subtracted between each pair of DEMs (2010–2018, 2010–2023, and 2018–2023). We analysed the distributions of these differences and also calculated the Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) of the differences between each DEM pair.
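The two error measures used in this validation can be written compactly. The sketch below computes MAE and RMSE from paired elevations sampled at the same locations in two DEMs; the elevation values are invented for illustration and do not come from the study's 1000-point sample.

```python
import math

def mae_rmse(dem_a, dem_b):
    """MAE and RMSE of the elevation differences between paired samples."""
    diffs = [a - b for a, b in zip(dem_a, dem_b)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, rmse

# Hypothetical elevations (m a.s.l.) at the same four points in two DEMs.
z_2018 = [320.0, 331.5, 342.0, 318.0]
z_2023 = [320.1, 331.3, 342.0, 318.4]
mae, rmse = mae_rmse(z_2018, z_2023)
```

Because RMSE squares the differences before averaging, it is never smaller than MAE and reacts more strongly to isolated large discrepancies, such as misclassified points.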

2.4. A ML-Based Predictive Model to Calculate Erosive and Sedimentary Processes Caused by the Reservoir’s Impact on Archaeological Sites

2.4.1. Generation of Explanatory Variables

Considering the error and accuracy values, we conducted a statistical analysis of the data, their temporal trends, and spatial distributions. We generated a series of explanatory raster layers (Table 2) based on the 2018 DEM. The 2018 dataset was chosen because it served as the reference for the co-registration and represents an intermediate state between the 2010 and 2023 data. All calculations were implemented using the Whitebox library, which offers extensive documentation and configuration options. The selected explanatory variables are listed in Table 2.

Table 2. Variables used for training the model, description, and reasons for inclusion.

Variable	Description	Reason for Inclusion	Ref.
Slope	Determines the speed of surface water flow. Steeper slopes accelerate the flow, increasing the water’s erosive capacity.	Critical for identifying areas prone to erosion due to faster water movement on steep slopes.	[74]
Aspect (terrain orientation): sine and cosine of aspect	Indicates the orientation of terrain cells toward cardinal directions (0–360°). To avoid errors during model training (e.g., treating 359° and 0° as distant), this variable is encoded into two circular continuity variables: the sine and cosine of the aspect.	Helps understand how terrain orientation affects water flow direction and erosion patterns, especially in areas with predominant flow from certain directions.	[75]
Profile Curvature	Describes terrain convexity or concavity in the direction of the steepest slope. Concave areas accumulate water and sediments, while convex areas are more exposed to erosion.	Identifies areas where sediments are likely to deposit (concave) or be eroded (convex), directly impacting archaeological site preservation.	[76]
Planimetric Curvature	Reflects terrain curvature perpendicular to the steepest slope. Identifies areas where water flow converges (greater erosion) or diverges (less erosion).	Useful for locating zones with concentrated erosion due to flow convergence or minimal erosion due to flow divergence.	[77]
Terrain Ruggedness Index (TRI)	Measures elevation variation within a local area. Higher ruggedness slows water flow, potentially increasing erosive process accumulation.	Determines how terrain heterogeneity influences water flow and erosion potential in rugged areas where archaeological sites might be exposed.	[78]
Topographic Wetness Index (TWI)	Estimates the relationship between flow accumulation and slope. Areas with higher flow accumulation may promote soil erosion.	Helps assess areas with increased risk of erosion where water accumulation and slope interact, affecting preservation conditions.	[79]
Sediment Transport Index (STI)	Evaluates sediment transport capacity based on flow accumulation and slope. Areas with high STI values are often critical for sedimentation.	Indicates areas where sediment deposition could bury archaeological sites, potentially protecting them or affecting their accessibility.	[80]
Flow Accumulation (SCA)	Evaluates the potential sediment transport capacity based on flow accumulation and slope. Areas with high flow accumulation are critical for material deposition.	Highlights areas of material deposition or erosion, crucial for understanding changes in sedimentation near submerged or exposed sites.	[81]
Stream Power Index (SPI)	Combines slope and flow accumulation to estimate water’s erosive power and potential erodibility. Identifies areas with high erosion potential where steep slopes and flow accumulation converge.	Identifies high-risk erosion zones, particularly where steep slopes intersect high water flow, impacting site preservation and exposure.	[80]
Flood Frequency	The proportion of days a pixel in the DEM was submerged under water. Calculated by dividing the number of submerged days by the total days in the series. Areas submerged for longer are less exposed to erosion. Normalised to [0, 1].	Helps determine how prolonged submersion influences site preservation, as submerged areas are less affected by erosion, but may face other degradation factors.	Figure 8 and Figure 9
Shoreline Position Frequency	Proportion of days the water level intersects a pixel (within the DEM error margin). Calculated by dividing the number of shoreline days by the total series days. Areas with higher recurrence are more prone to erosion. Normalised to [0, 1].	Indicates erosion-prone areas due to frequent water level fluctuations, directly impacting archaeological sites located near the reservoir’s shoreline.	Figure 9
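The circular encoding of aspect listed in Table 2 can be illustrated with a short sketch. The function names are hypothetical; the point is only to show that, once encoded, 359° and 0° become close neighbours.

```python
import math

def encode_aspect(aspect_deg):
    """Encode an aspect angle (degrees) as a (sine, cosine) pair."""
    rad = math.radians(aspect_deg)
    return math.sin(rad), math.cos(rad)

def circular_gap(a_deg, b_deg):
    """Euclidean distance between two aspects in (sine, cosine) space."""
    return math.dist(encode_aspect(a_deg), encode_aspect(b_deg))

near = circular_gap(359.0, 0.0)    # almost identical orientations
far = circular_gap(0.0, 180.0)     # opposite orientations
```

On the raw 0–360° scale the first pair would differ by 359 units, whereas in the encoded space its distance is far smaller than that of the opposite orientations.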

Figure 8. Map showing the number of days that each location was flooded in the interval 1970–2021 in Valdecañas Reservoir (Spain).

Figure 9. Example of frequency calculations using reservoir records and DEMs. In this case, the 2018 DEM is used as the base, and the interval considered is from 2019 to 2021. (a) Normalised frequency of the number of days that the reservoir locations remained submerged. (b) Normalised frequency of the number of days that the shoreline remained in a specific position.

For the calculation of the last two variables (“Flood Frequency” and “Shoreline Position Frequency”), two sets of calculations were performed, each using daily elevation data: one for the 2010–2018 interval and another for the 2018–2023 interval. The uncertainty in the date of LiDAR data acquisition and the break in the water-level data series in 2021 reduce the overall accuracy of these frequency variables, so we normalised both Flood Frequency and Shoreline Position Frequency (Figure 8). For the 2010–2018 period, data were considered from 1 August 2010 (approximate start date of the LiDAR flights) to 31 December 2018 (an intermediate date for the 2018 coverage). For the 2018–2023 period, data from 1 January 2019 to mid-2021 were used. The 2018 DEM was used for the first period’s calculations, and the 2023 DEM for the second. This choice was made to better reflect the observed terrain changes at the time of calculation, as both DEMs were derived from denser and more accurate point clouds for those intervals.
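A minimal sketch of the two frequency variables for a single DEM cell is given below. It assumes that a cell counts as submerged when the daily water level exceeds its elevation, and that a "shoreline day" is one on which the level lies within an error margin of the cell elevation; the 0.3 m margin, the function names, and the water levels are assumptions made for illustration.

```python
def flood_frequency(cell_z, daily_levels):
    """Share of days on which the water level is above the cell elevation."""
    submerged = sum(1 for level in daily_levels if level > cell_z)
    return submerged / len(daily_levels)

def shoreline_frequency(cell_z, daily_levels, dem_error=0.3):
    """Share of days on which the water level lies within dem_error of the cell."""
    hits = sum(1 for level in daily_levels if abs(level - cell_z) <= dem_error)
    return hits / len(daily_levels)

# Hypothetical daily water levels (m a.s.l.) and a cell at 305 m a.s.l.
levels = [310.0, 308.0, 305.2, 305.0, 302.0]
flood = flood_frequency(305.0, levels)
shore = shoreline_frequency(305.0, levels)
```

Both functions return a proportion in [0, 1], which corresponds to the normalisation described in Table 2; applied to every cell of a DEM, they would yield the two raster layers.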

2.4.2. ML-Based (XGBoost) Model: Definition and Training

To create a predictive model, we used a Machine Learning algorithm, XGBoost [82], which has a Python implementation. XGBoost is an optimised gradient boosting framework for decision trees and is highly effective for analysing large datasets due to its computational optimisation and parallelisation. Among its advantages is the use of regularisation parameters, such as lambda (L2 regularisation) and alpha (L1 regularisation), which penalise tree complexity and help prevent the overfitting seen in other gradient boosting models. We chose XGBoost over other regression algorithms due to the non-linear nature of the regression problem addressed. Additionally, XGBoost is more interpretable than other techniques such as neural networks, yet more flexible than a simple linear model. This interpretability makes it easier to evaluate the importance of the variables used by the model.
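For reference, the role of the two regularisation parameters can be seen in the general form of the regularised objective minimised by XGBoost. The notation below is ours and is not taken from the study: l is the loss between observed and predicted values, f_k is the k-th tree, J is its number of leaves, w_j are its leaf weights, and γ is the per-leaf complexity penalty.

```latex
\mathcal{L} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i\right) + \sum_{k=1}^{K} \Omega(f_k),
\qquad
\Omega(f) = \gamma J + \tfrac{1}{2}\,\lambda \sum_{j=1}^{J} w_j^{2} + \alpha \sum_{j=1}^{J} \lvert w_j \rvert
```

Larger values of λ and α shrink the leaf weights, so that individual trees contribute smaller corrections and the ensemble is less likely to fit noise.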

We created two different XGBoost models: one reflecting terrain variations between 2010 and 2018 (Model 1) and the other for 2018 and 2023 (Model 2). Both models use the differences between the DEMs as the response variable and the 12 explanatory variables previously described in Section 2.4.1.

One of the issues we identified is the need to account for DEM accuracy in the model design. In other words, it is not feasible to predict the response variable (“difference between models”) within value ranges that are likely dominated by noise factors such as LiDAR data accuracy or registration errors. For this reason, the accuracy values obtained in Section 2.3 were used as thresholds to split each of Model 1 and Model 2 into two XGBoost models: one for cells with absolute difference values below the DEM accuracy threshold (Models 1A and 2A) and another for cells with absolute differences above this threshold (Models 1B and 2B). We also excluded outliers whose absolute difference values exceeded the upper fence of Q3 + 1.5 × IQR. This approach resulted in the models described in Table 3.
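The partition of the response variable can be sketched as follows. The example assumes a 0.3 m accuracy threshold and the conventional upper fence of Q3 + 1.5 × IQR computed on absolute differences; the function names, the quartile method, and the ΔZ values are illustrative assumptions.

```python
import statistics

def split_by_threshold(dz_values, threshold=0.3):
    """Separate cells below the accuracy threshold (A) from those above it (B)."""
    below = [dz for dz in dz_values if abs(dz) < threshold]
    above = [dz for dz in dz_values if abs(dz) >= threshold]
    return below, above

def drop_iqr_outliers(dz_values, k=1.5):
    """Discard cells whose absolute difference exceeds Q3 + k * IQR."""
    mags = [abs(dz) for dz in dz_values]
    q1, _, q3 = statistics.quantiles(mags, n=4, method="inclusive")
    upper = q3 + k * (q3 - q1)
    return [dz for dz in dz_values if abs(dz) <= upper]

# Hypothetical DEM differences (m) for eight cells.
dz = [0.05, -0.1, 0.35, -0.5, 0.6, 0.45, -0.4, 7.5]
subset_a, subset_b = split_by_threshold(dz)
subset_b_clean = drop_iqr_outliers(subset_b)
```

In this toy example, the 7.5 m value is removed as an outlier, leaving five cells for the above-threshold subset and two for the below-threshold subset.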

To fine-tune the XGBoost model parameters, we used grid search optimisation in Python’s scikit-learn library [83], leveraging XGBoost’s API. Hyperparameters were systematically tested and optimised by minimising the RMSE, to balance model complexity and performance.

Finally, we used the SHAP tool [84] to interpret the XGBoost models’ outputs and determine the effects of each variable on the predictions. The SHAP values helped to quantify how each variable influences the model’s output for each instance.
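The way in which SHAP values quantify the contribution of each variable follows from their additive property, which can be stated as below. The notation is ours: f(x) is the model prediction for a cell, φ_0 is the average prediction over the reference data, and φ_j(x) is the contribution of variable j, with M = 12 in this study.

```latex
f(x) = \phi_0 + \sum_{j=1}^{M} \phi_j(x)
```

A positive φ_j(x) therefore pushes the predicted elevation difference for that cell above the average prediction, and a negative value pushes it below.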

3. Results

3.1. Co-Registration of Maps and Accuracy Estimation

The results of the co-registration processes are shown in Table 4. Overall, the average alignment shows fitness values exceeding 0.8 (on a 0–1 scale) and relatively low standard deviations (Table 4), which indicates optimal alignment. The RMSE values for inliers (point pairs considered a good match) average around 0.28 m, with a standard deviation of just 0.01 m. These statistics show that the co-registration provided precise alignment and that the resulting point clouds can be reliably compared, with a conservative error margin of 0.3 m.

The highest MAE values are observed in comparisons using the 2010 DEM as a reference, though the difference is barely 0.05 m. The 2010 dataset’s lower point density and vertical accuracy likely contribute to the observed differences. The lowest MAE occurs between the 2018 and 2023 coverages, although differences in the RMSE are more pronounced, possibly due to anomalous points or differences in classification among LiDAR datasets. Nevertheless, the errors are consistent and fall within the RMSE ranges provided as references by the Spanish National Geographic Institute, and align with those obtained during the co-registration phase.

Finally, a post hoc comparison test was performed between ΔZ variables using the Mann–Whitney U test with Bonferroni adjustment, which is suitable for comparing independent samples. Bonferroni correction further reduces the risk of Type I errors when performing multiple comparisons [85]. The results are as follows:

ΔZ_2010–2018 (x, y): U = 414,537, p = 0.965, p_adjusted = 1

ΔZ_2010–2023 (x, y): U = 414,599, p = 0.961, p_adjusted = 1

ΔZ_2018–2023 (x, y): U = 414,130, p = 0.994, p_adjusted = 1

It can be concluded that no statistically significant differences are found between the elevation values derived from the DEMs. The distributions of differences are highly similar, and the variations within the emerged area are not large enough to be statistically detectable.
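The adjusted p-values reported above are consistent with a standard Bonferroni correction over three comparisons, as the following sketch shows. The number of comparisons (three) is inferred from the three results listed and is an assumption; the raw p-values are those reported in the text.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Raw p-values reported for the three DEM pairs.
raw_p = {"2010-2018": 0.965, "2010-2023": 0.961, "2018-2023": 0.994}
adjusted = dict(zip(raw_p, bonferroni(list(raw_p.values()))))
```

Since every raw p-value is already close to 1, multiplying by three exceeds the cap in all cases, which is why each adjusted value equals 1.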

The results confirm that the comparisons between the 2018 and 2023 DEMs are more accurate and robust than those involving the 2010 model. Based on the statistical indicators provided (fitness values consistently above 0.8, RMSE ~0.28 m, and standard deviations ~0.01 m), the co-registration and resulting comparisons can be considered highly reliable. A margin of error of ±0.3 m is a reasonable and conservative estimate for further analytical use. However, comparisons involving the 2010 DEM can still be significant, provided that they clearly exceed the 0.15 m MAE established as a general criterion, while accounting for potential local variations.

To facilitate qualitative geographic analysis, we created composite RGB raster layers from the DEM differences (ΔZ). Specifically, the ΔZ_2010–2018 raster was assigned to the red channel, ΔZ_2010–2023 to the green channel, and ΔZ_2018–2023 to the blue channel. This produced RGB images highlighting topographic changes. For instance, an intense green colour may indicate more significant changes that occurred during the 2010–2023 interval. In this case, such a pattern would be the most expected, as it reflects a longer time span, highlighting continuous processes over time. Figure 10 illustrates this procedure.
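The construction of the composite can be sketched for a pair of pixels. The example assumes that absolute differences are stretched linearly from 0–1 m to 0–255 and clipped beyond that range; the stretch limits, the use of absolute values, and the function names are assumptions, since the text does not specify how the channels were scaled.

```python
def stretch(values, lo=0.0, hi=1.0):
    """Rescale absolute differences linearly to 0-255, clipping to [lo, hi]."""
    scaled = []
    for v in values:
        clipped = min(max(abs(v), lo), hi)
        scaled.append(round(255 * (clipped - lo) / (hi - lo)))
    return scaled

def rgb_composite(dz_2010_2018, dz_2010_2023, dz_2018_2023):
    """Red, green, and blue channels from the three DEM difference rasters."""
    return list(zip(stretch(dz_2010_2018), stretch(dz_2010_2023), stretch(dz_2018_2023)))

# Two hypothetical pixels: one stable, one with change concentrated after 2018.
pixels = rgb_composite([0.0, -0.2], [0.0, -1.0], [0.0, -0.8])
```

The stable pixel remains black, while the second pixel takes most of its intensity from the channels that include the later period.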

3.2. Results of the Evaluation of Topographic Changes in Archaeological Sites

For a more effective comparison, we considered only those archaeological sites that met the following criteria: (1) sites located partially or entirely below the reservoir’s maximum flood level (315 m) and (2) sites whose areas were partially located above an elevation of 301 m, the minimum common elevation for all three DEMs. After analysing the database, 40 sites met these conditions (Figure 11). In Supplementary Material Data S1, these sites are listed along with their chronology and basic typology, as well as a brief historical discussion for each group of sites.

Of the 40 archaeological sites analysed, 37.5% (15 sites) have more than 50% of their area within the 301–315 m a.s.l. range, meaning that they are recurrently exposed to reservoir fluctuations and within the elevation range captured by LiDAR datasets (Figure 12). We analysed whether the distributions of |ΔZ_2010–2018| and |ΔZ_2018–2023| show significant variations at these sites. To do so, a pixel count within the [0.3, 1] m interval was used, ensuring that we focussed on changes above the noise level (≥0.3 m) and minimising the effect of extreme outliers (>1 m), which could be artefacts. A Mann–Whitney U test yielded U = 743.5 and p = 0.59, indicating that the difference between the two distributions is not statistically significant. In other words, the more recent variations in the reservoir’s hydrological regime do not appear to have significantly influenced topographic changes greater than 0.3 m between both periods.

However, at marginal sites (e.g., site 500,018), average elevation losses of −0.48 m were recorded between 2010 and 2023, with 5356 cells showing a ≥0.3 m loss of sediment. This pattern intensified after 2018: for example, site 500,169 experienced an average erosion of −0.50 m, with 458 cells showing changes of >1 m (likely due to data artefacts or outliers). These results underscore that transition zones—between the waterline and the land—are particularly vulnerable, as hydrological fluctuations accelerate degradation in these areas. These processes have two repercussions for archaeological assets [37,39]: the loss of sediment—resulting in the exposure of elements to the open air or their transport (depending on their weight)—and the deflation that causes the loss of binding material in the construction materials of buildings, leading to their eventual collapse.

3.3. XGBoost Predictive Models

Models 1B and 2B were trained on cells with absolute values between 0.3 and 1 m. While the outliers of the absolute values of the distribution reach ~7.56 m in Model 2B (values > Q3 + 1.5 × IQR), practical interpretation suggests that values of > 1 m can already be considered infrequent. In fact, the upper bound for the same distribution in Model 1B is ~1.2 m. Thus, we limited the training data to |ΔZ| ≤ 1 m for both models.
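The outlier bound and the training filter can be sketched as follows. The quantile method and the sample values are assumptions; the text does not state how the quartiles were computed.

```python
import statistics


def upper_fence(values):
    """Tukey upper fence: Q3 + 1.5 * IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 + 1.5 * (q3 - q1)


def select_training_cells(dz_values, low=0.3, high=1.0):
    """Keep cells whose absolute change lies between the noise level and 1 m."""
    return [dz for dz in dz_values if low <= abs(dz) <= high]


# Hypothetical ΔZ sample in metres.
dz = [-0.05, 0.1, -0.35, 0.5, -0.9, 1.4, -7.5, 0.3]
fence = upper_fence([abs(v) for v in dz])
training = select_training_cells(dz)  # keeps -0.35, 0.5, -0.9 and 0.3
```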

Given the large volume of input data for the model (35 million pixels, with 12 explanatory variables and one response variable), random sampling was performed: 10% of the data for Models 1B and 2B, and 1% for Models 1A and 2A, which still yielded sufficient sample sizes for training. Table 5 lists the number of cells used for training in each model. For the XGBoost configuration, we used a learning rate (α) of 0.01, a maximum tree depth of 10, up to 300 estimators, and a lambda (L2 regularisation) value of 1. The performance of the trained XGBoost models is also presented in Table 5.
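A possible way to express the sampling and the configuration is shown below. Mapping the reported settings onto the parameter names of the scikit-learn interface of the xgboost package is our assumption, as are the seed and the helper names; the original code is not published.

```python
import random

# Hyperparameters quoted in the text, keyed by XGBRegressor parameter names
# (the mapping is an assumption of this sketch).
XGB_PARAMS = {
    "learning_rate": 0.01,
    "max_depth": 10,
    "n_estimators": 300,
    "reg_lambda": 1.0,  # L2 regularisation
}


def sample_rows(n_rows, fraction, seed=0):
    """Return sorted row indices of a simple random sample without replacement."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_rows), round(n_rows * fraction)))


def build_model():
    """Create the regressor; needs the third-party xgboost package."""
    from xgboost import XGBRegressor  # lazy import: the sketch loads without xgboost
    return XGBRegressor(**XGB_PARAMS)


rows_b = sample_rows(1000, 0.10)  # 10% sample, as for Models 1B and 2B
rows_a = sample_rows(1000, 0.01)  # 1% sample, as for Models 1A and 2A
```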

Most of the training data for Models 1A and 2A come from areas with very small DEM differences (<0.3 m, near the global noise threshold). These models show poor performance (Table 5), with low R² values indicating that they explain a small percentage of the variance in the data (0.096 for Model 1A and 0.243 for Model 2A). Despite the low and consistent RMSE values (0.06), likely influenced by a local high agreement between the DEMs, these results can confirm that the data in the ±0.3 m interval are effectively noise.
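The combination of a low RMSE with a low R² is easy to reproduce when the target values are confined to a narrow band. The following sketch uses invented noise-level values and a trivial predictor to show the effect; it does not reproduce Models 1A or 2A.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical ΔZ values (m) inside the noise band, predicted as zero change.
observed = [0.05, -0.04, 0.06, -0.07, 0.02, -0.03]
predicted = [0.0] * len(observed)
error = rmse(observed, predicted)      # small, because all values are small
fit = r_squared(observed, predicted)   # near zero: no variance is explained
```

Because RMSE is expressed in the units of the data, it is bounded by the width of the band, whereas R² measures how much of the spread within that band the model recovers.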

Model 1B can explain 43% of the variance in the data (R² ≈ 0.43). In contrast, Model 2B achieves a higher R² coefficient (0.685), indicating a considerably better ability to explain the variance in DEM differences. An R² near 0.685 is quite acceptable, considering that the differences between DEMs may be influenced by factors that our model does not explicitly include (e.g., human activities).

Figure 13 and Figure 14 illustrate the gain assigned by the XGBoost models “Model 1B” and “Model 2B” to the variables based on their contribution to explaining the variance within the models. Table 6 summarises the relative importance (gain) of each variable in both models and their rankings.

Overall, there is consistency in the most important variables across both models. Notably, the “cosine of aspect”, a variable that defines the terrain’s orientation relative to the cardinal directions, stands out in both models. This variable is even more influential in Model 2B than in Model 1B, which could be attributed to a combination of factors: the higher resolution of the data used in Model 2B, as well as temporal dynamics in the data that we cannot clearly characterise. Moreover, the “sine of aspect” is also among the top variables in Model 1B. The two variables are essentially uncorrelated (r_s = 0.07, p < 0.001), so collinearity does not seem to explain their high importance values.
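For readers unfamiliar with this encoding, the sketch below shows how an aspect angle is split into its cosine and sine components. It assumes the common GIS convention of degrees measured clockwise from north, which the text does not state explicitly.

```python
import math


def aspect_components(aspect_deg):
    """Return (cosine, sine) of an aspect given in degrees clockwise from north.

    Under this convention the cosine is +1 for north-facing and -1 for
    south-facing slopes; the sine is +1 for east-facing and -1 for west-facing.
    """
    a = math.radians(aspect_deg)
    return math.cos(a), math.sin(a)


north = aspect_components(0.0)
east = aspect_components(90.0)
south = aspect_components(180.0)
# 359 degrees and 1 degree are nearly the same orientation; the raw angles
# differ by 358, but their cosines are almost identical.
almost_north_a = aspect_components(359.0)
almost_north_b = aspect_components(1.0)
```

Splitting the angle in this way removes the artificial jump between 359° and 0° that would otherwise mislead a tree-based model.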

The “cosine of aspect” exhibits the highest gain of all, meaning that each time a split is made in the decision tree, it results in a significant reduction in error while affecting a large amount of data in Model 2B. The SHAP analysis (Figure 15) reveals that certain terrain orientations favour an increase in model outputs. The variable “days submerged” is used frequently (high weight values), suggesting that the number of days during which a location has remained flooded is relevant in the decision-making process. However, its average gain is lower than that of aspect. In the SHAP analysis, higher values of “days submerged” lead to higher predicted values (Figure 15), which could indicate that areas less frequently exposed (more often underwater) tend to experience lower erosion and more sedimentation.

“Profile Curvature” is used frequently in the models (high weight) (Figure 15), although its gain is not particularly high. Slope and waterline frequency are also used frequently, but their contributions to the model’s gain remain moderate.
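The distinction between weight and gain used in this discussion can be made concrete with a toy example. In XGBoost, weight is the number of splits that use a variable and gain is the average loss reduction of those splits; the list of splits below is invented for illustration.

```python
from collections import defaultdict


def summarise_splits(splits):
    """Summarise (feature, loss_reduction) pairs, one pair per tree split."""
    n_splits = defaultdict(int)
    gain_sum = defaultdict(float)
    for feature, gain in splits:
        n_splits[feature] += 1
        gain_sum[feature] += gain
    return {
        f: {"weight": n_splits[f], "gain": gain_sum[f] / n_splits[f]}
        for f in n_splits
    }


# Hypothetical splits collected from a small ensemble.
splits = [
    ("aspect_cos", 9.0),
    ("days_submerged", 2.0),
    ("days_submerged", 1.0),
    ("days_submerged", 3.0),
    ("aspect_cos", 7.0),
]
summary = summarise_splits(splits)
```

In this toy case "days_submerged" is used more often (higher weight) while "aspect_cos" yields larger improvements per split (higher gain), which mirrors the pattern described above.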

3.4. Assessment of Potential Erosion and Sedimentary Processes on Archaeological Sites

Once the regression fit of the model was verified, we decided to apply Model 2B to predictions within the buffers of the archaeological sites described in Section 3.2, specifically a 100 m buffer around the site centroid. This choice was based on the better fit of Model 2B (R² = 0.685). Predictions were made by computing all explanatory variables on the 2023 DEM following the methodology described in Section 2.3, and then applying the trained model to those variables at each site’s location.
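Selecting the cells that fall inside a site's buffer can be sketched as below, assuming cell centres and centroids are given in a projected coordinate system with metre units (the coordinates are hypothetical).

```python
import math

BUFFER_RADIUS_M = 100.0  # buffer radius around the site centroid (from the text)


def cells_within_buffer(cells, centroid, radius_m=BUFFER_RADIUS_M):
    """Return the (x, y) cell centres no farther than radius_m from the centroid."""
    cx, cy = centroid
    return [(x, y) for x, y in cells if math.hypot(x - cx, y - cy) <= radius_m]


# Hypothetical cell centres around a centroid placed at the origin.
cells = [(30.0, 40.0), (60.0, 80.0), (100.0, 100.0)]
inside = cells_within_buffer(cells, (0.0, 0.0))
```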

For these predictions, we considered only the emerged areas in the 2023 DEM with an elevation below 315 m—the maximum flood level of the reservoir—to ensure credible results. To facilitate interpretation, we classified the predicted topographic change values into categories. The largest and most common category is changes within the range [−0.3, 0.3], which are subject to measurement errors. The remaining categories are as follows: [−1, −0.65] for high erosion, [−0.65, −0.3] for moderate erosion, [0.3, 0.65] for moderate sedimentation, and [0.65, 1] for high sedimentation. No high erosion values were predicted, with this being the reason why Figure 16 and Figure 17 do not show the [−1, −0.65] category for erosion.
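The classification can be written as a small function. The intervals in the text share their end points, so assigning a boundary value to the lower-intensity class, and labelling values beyond ±1 m separately, are assumptions of this sketch.

```python
from collections import Counter


def classify_change(dz):
    """Assign a predicted ΔZ (m) to one of the categories used in the text."""
    magnitude = abs(dz)
    if magnitude <= 0.3:
        return "within measurement error"
    if magnitude > 1.0:
        return "outside modelled range"
    intensity = "moderate" if magnitude <= 0.65 else "high"
    process = "sedimentation" if dz > 0 else "erosion"
    return f"{intensity} {process}"


def category_shares(predicted):
    """Percentage of cells per category."""
    counts = Counter(classify_change(v) for v in predicted)
    return {label: 100.0 * n / len(predicted) for label, n in counts.items()}


# Hypothetical predictions for the cells of one site buffer.
predictions = [0.1, -0.2, 0.4, 0.5, -0.4, 0.0, 0.7, 0.2]
shares = category_shares(predictions)
```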

In general (Figure 16), the changes predicted by the model fall within the [−0.65, −0.3] and [0.3, 0.65] intervals in relatively low percentages. The neutral range (|ΔZ| ≤ 0.3 m) encompasses a large proportion of the predictions at many sites. Moderate sedimentation, within the [0.3, 0.65] interval, tends to be better represented than erosion, though there is notable variation between sites. Overall, the predicted sedimentation values at archaeological sites are higher than the predicted erosion magnitudes, and both types of changes (positive or negative) tend to be of moderate intensity. However, Figure 17 highlights several cases where certain sites show pronounced variations in both the erosion and sedimentation categories.

3.5. Modelling on a Historical DEM: A Tentative Approach

The modelling of variables using the 1957 DEM serves purely as an illustrative demonstration of the model’s capabilities and as an exploratory method for estimating topographic changes in areas not covered by current LiDAR (i.e., terrain that has remained submerged most of the time). There are three main limitations to this exercise: (1) the differences in DEM resolution (1957 DEM at 5 m vs. 2018 DEM at 1 m), which affect the comparability of the results [86]; (2) the low precision of the 1957 DEM itself; and (3) the fact that the hydrological regime characteristics used to train Models 1B and 2B have been inconsistent over time, as evidenced by the previously discussed historical data series (Figure 4). Moreover, both models are trained on 5-year interval changes. Extrapolating them to a ~50-year interval is therefore not possible: water management has differed significantly over the decades, and numerous natural and anthropogenic processes, such as cumulative erosion or changes in land use, introduce variability that a model trained on 5-year data cannot accurately capture.

Although a basic characterisation of the accuracy of the 1957 DEM was already available [44], we performed a new accuracy estimation using the 2018 DEM as a reference. We sampled 1000 random points in non-floodable areas and compared the 1957 and 2018 elevations at those points (Figure 18). The results indicate a bias of −2.27 m, meaning that the 1957 DEM underestimates elevations relative to the 2018 DEM. The RMSE is 8.16 m and the MAE is 3.27 m, with a standard deviation of the error of 7.91 m, suggesting significant differences between the models. A Wilcoxon signed-rank test for paired samples confirmed that these differences are statistically significant (W = 66,552, p < 0.001), so we can reject the null hypothesis and conclude that the two DEMs differ in a non-random way. In other words, the bias and discrepancies are systematic, not due to chance.
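The accuracy metrics quoted here can be computed as in the following sketch, which uses four invented checkpoints instead of the 1000 sampled points. The function name and the use of the population form of the standard deviation are our assumptions.

```python
import math


def dem_error_stats(test_elev, reference_elev):
    """Bias, MAE, RMSE and standard deviation of (test - reference) errors."""
    errors = [t - r for t, r in zip(test_elev, reference_elev)]
    n = len(errors)
    bias = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    std = math.sqrt(sum((e - bias) ** 2 for e in errors) / n)
    return {"bias": bias, "mae": mae, "rmse": rmse, "std": std}


# Hypothetical elevations (m) at four checkpoints.
dem_1957 = [300.0, 310.0, 295.0, 320.0]
dem_2018 = [302.0, 309.0, 300.0, 322.0]
stats = dem_error_stats(dem_1957, dem_2018)
# The three dispersion measures are linked by rmse**2 == bias**2 + std**2.
```

A negative bias means the tested DEM sits below the reference on average, while the standard deviation is non-negative by definition. Applying the same identity to the reported bias (−2.27 m) and RMSE (8.16 m) implies an error spread of about 7.8 m.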

Since applying our earlier predictive models directly to the 1957 DEM would yield spurious results, we decided not to use Model 1B or 2B for predictions on the 1957 data. Instead, for purely descriptive rather than predictive purposes, we conducted a multicriteria analysis based on WLC [87]. In this approach, we combined variables weighted by their importance. We took the importance (gains) of the four most important variables from Model 2B (as per Table 6), normalised them, and used these weights to linearly combine those variables on the 1957 DEM. The sign of the effect (positive or negative) was derived from the SHAP analysis. Table 7 shows the normalised weights and effect signs for the four variables.

Using these weights, we generated the WLC raster with the following formula:

WLC raster = (−0.49 × aspect_cos) + (0.32 × days_submerged) + (−0.11 × slope) + (0.09 × shoreline_frequency)
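The weighting step and the formula above can be sketched per cell as follows. The signed weights are those of the formula; the raw gains passed to the normalisation helper are hypothetical (Table 6 is not reproduced here), and the sketch assumes that the input variables have already been rescaled to comparable ranges, which the text does not specify.

```python
# Signed, normalised weights taken from the WLC formula in the text.
WLC_WEIGHTS = {
    "aspect_cos": -0.49,
    "days_submerged": 0.32,
    "slope": -0.11,
    "shoreline_frequency": 0.09,
}


def normalise(gains):
    """Rescale positive importance values so that they sum to 1."""
    total = sum(gains.values())
    return {name: value / total for name, value in gains.items()}


def wlc_score(cell):
    """Weighted linear combination of the four variables for one cell."""
    return sum(weight * cell[name] for name, weight in WLC_WEIGHTS.items())


# Hypothetical raw gains, only to illustrate the normalisation step.
example_weights = normalise({"a": 2.0, "b": 6.0})

# Hypothetical cell with every variable rescaled to 1.0.
cell = {"aspect_cos": 1.0, "days_submerged": 1.0, "slope": 1.0, "shoreline_frequency": 1.0}
score = wlc_score(cell)
```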

However, the resulting raster presents several limitations. First, it is a linear combination of variables, which is inconsistent with the inherently non-linear nature of an XGBoost model. Secondly, determining the influence of each variable in the new combined raster can be complex, especially at local scales, where an XGBoost approach is able to combine multiple variables with different weights and interactions. Finally, the results of the raster do not represent predicted values for topographic changes, but rather a linear mixture of factors, which complicates interpretation. The WLC raster can be considered a tool for highlighting areas predisposed to erosion or sedimentation under the scenario represented by the 1957 DEM, which itself has accuracy issues (Figure 19).

Interestingly, the WLC raster results (Figure 19) show clear coherence with the visual observations recorded in the field. Former river channels (the pre-reservoir drainage network) appear to concentrate the highest (positive) values, suggesting sedimentation, while steep areas show values indicative of erosion. These types of predictions can be qualitatively confirmed in the most significant sites within the study area, such as the Dolmen de Guadalperal [44,53,54] (Figure 20) or the surroundings of the Roman city of Augustóbriga [88], at the old town of Talavera La Vieja (Figure 21). In both cases, as indicated by the WLC raster, the greatest erosion occurs along the edges of elevated areas, whereas the plateau surfaces where the archaeological sites are located experience both erosion and sedimentation processes of varying intensities.

4. Discussion

4.1. Overall Limitations of the Proposed Methodology

This methodology presents several limitations. One major limitation is the restriction of the study area to zones that were exposed at the time of data acquisition, which confines the analysis to a narrow band of the reservoir (specifically between 301 and 315 m a.s.l.). As a result, predictions for areas that emerge less frequently may be unreliable. However, some of these areas, such as the site of the Dolmen de Guadalperal, located at 296 m a.s.l., are of particular interest. A theoretical solution for this is the use of DEMs derived from aerial photographs taken before the reservoir’s flooding, as discussed in Section 3.5 with the 1957 DEM.

Another challenge is the precision of the methodology. With the current workflow, it is only feasible to detect processes (1) producing topographic changes greater than about 0.3 m, and (2) causing consistent alterations with some continuity or repetition over 5-year (or longer) periods. The fact that our RMSE threshold is ~0.3 m complicates the characterisation of subtler patterns, and may obscure more dynamic or complex erosional and sedimentary processes than those described here. This limitation is particularly critical in cases where detecting smaller changes (e.g., consistent erosion of 0.1 m over consecutive 5-year intervals) could, over time, lead to the degradation or loss of an archaeological site.
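The example of slow, consistent erosion can be worked through numerically. The sketch below assumes a constant loss per interval and a comparison against the first survey; values are expressed in whole centimetres to avoid floating-point rounding.

```python
INTERVAL_YEARS = 5
DETECTION_THRESHOLD_CM = 30  # about 0.3 m, the detection limit discussed above


def intervals_until_detectable(change_per_interval_cm, threshold_cm=DETECTION_THRESHOLD_CM):
    """Smallest number of intervals whose cumulative change reaches the threshold."""
    return -(-threshold_cm // change_per_interval_cm)  # ceiling division on integers


# Consistent erosion of 0.1 m (10 cm) per 5-year interval.
n_intervals = intervals_until_detectable(10)
years_until_detectable = n_intervals * INTERVAL_YEARS
```

Under these assumptions, a site losing 0.1 m per interval would accumulate 0.3 m of erosion over three intervals, or 15 years, before the change reaches the detection limit.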

The RMSE value is consistent across products generated from different LiDAR coverages, even for the denser datasets from 2018 and 2023. Future improvements in the accuracy and density of LiDAR point clouds may help achieve greater resolution in detecting small topographic changes. For now, constant monitoring of the most significant archaeological sites using higher-resolution geometric records is essential, as demonstrated in the case of Dolmen de Guadalperal [27]. In this case, a combination of aerial photogrammetry, LiDAR, and multibeams allowed for the identification of minor changes in multitemporal point clouds, with discrepancies in the RMSE as small as 0.05 m. The use of georeferenced markers, such as those employed in coastal dynamics analyses [89,90], which can estimate sediment loss or accumulation, could be another way to validate the model results.

Finally, although we can estimate each variable’s contribution to the model, the causes of the differences observed between the models cannot be straightforwardly explained. This limitation often arises from factors such as human activities and specific land uses in flood-prone areas and their surroundings. Additionally, some differences stem from categorical variables (e.g., soil types, lithology, or the typologies and chronologies of archaeological sites), which would require developing separate models for each situation.

4.2. The Potential of Quantitative Analysis

This study demonstrates from a methodological perspective that the multitemporal analysis of LiDAR data series must be approached with robust methodologies for comparison between point cloud-derived DEMs. Data must be accurately co-registered, and the accuracy of the differences between models must be clearly and statistically defined to ensure proper validation. As LiDAR point cloud resolutions and accuracies improve, it will become possible to detect smaller variations, which will also contribute to the training of more predictive ML models.

Having a constant measurement and monitoring tool such as LiDAR data allows for a clearer understanding of the changes affecting archaeological sites and their magnitudes. Repeated observations with higher resolution should improve the results, which for now serve to flag which of these sites are in greatest danger.

The use of ML is another aspect introduced in this study to estimate topographic changes. The models used to predict changes beyond the margin of error inherent to data acquisition methods have shown consistency. While Model 1B exhibits a somewhat poor fit, Model 2B demonstrates a more robust adjustment based on R² values, which are more reliable than metrics such as MAE and RMSE [91]. The reason for the disparity in fit between Models 1B and 2B is not entirely clear, but one factor to consider is the LiDAR point density; the higher point density in the later dataset may allow for a more accurate terrain representation, and thus, a better characterisation of erosional and sedimentary processes.

In any case, the proximity of Model 2B’s R² value to the 0.7 threshold—often considered an indicator of a model’s robustness in some scientific fields—demonstrates the effectiveness of the workflow we have designed and suggests that the observed variance is, to some extent, predictable. As we have stated, the comprehensive modelling of local factors influencing erosional and sedimentary processes may be virtually impossible. Therefore, achieving an R² of 0.685 may be considered reasonably satisfactory.

Additionally, identifying the most important variables that contribute to topographic alterations is a positive outcome. While it is evident that variables such as the slope or frequency of flooding can play a role in topographic changes, more subtle factors—such as terrain orientation, which appears to have the greatest weight in the models—are a key contribution of this study. The effect of this variable could be attributed to the dynamics of reservoir filling and emptying. These insights are useful because they allow us to identify which factors are most critical for the protection of the sites, thereby enabling us to better prioritise conservation actions. Furthermore, knowing that the most decisive variable is terrain aspect allows us to develop more specific protection strategies tailored to each type of terrain. However, the identification of these factors needs to be further tested in other contexts to confirm this pattern.

Certainly, confidence in a model’s predictive capability should never be used as an isolated tool in the management of archaeological heritage. No matter how well calibrated the model predictions may be, the same magnitude of changes can have unequal effects on different archaeological sites.

Finally, this study has aimed to provide a “retroactive” perspective on how the most significant factors may have altered our current understanding of the distribution of archaeological evidence within reservoirs. To achieve this, a low-accuracy DEM generated from aerial photographs taken in 1957 was used to identify the most transformed and least altered areas. While its analysis is valid as a merely indicative element that might provide a general representation of the impact of the most evident variables on the terrain, it lacks the precision of the LiDAR-generated DEMs. The 1957 DEM has a lower resolution (5 m), which would likely introduce noise if the obtained XGBoost models were applied. Although its results may not be quantitatively precise, they serve as a heuristic approach that allows us to assess and explain the current presence and absence of archaeological sites, offering valuable insights for archaeological research and cultural heritage management.

5. Conclusions

From a technical perspective, we have demonstrated that it is possible to measure and map changes greater than ~0.3 m using the resolution of current LiDAR products and multitemporal statistical analysis. Improvements in the spatial and temporal resolution of observations can undoubtedly enhance models like the one proposed, making them more sensitive to change detection. The model developed at the highest resolution achieves an R² of 0.685, thus reasonably predicting the variance of changes exceeding 0.3 m. The remaining unexplained variance might be attributed to factors not included in the model, such as human activity (e.g., agricultural land use in areas surrounding the reservoir) or variations in soil composition. It is also important to highlight the specificity of the analysed case, as the model’s effectiveness may vary depending on the context in which it is applied.

In the current scenario—characterised by pronounced variability in climatic conditions and the evolving management of natural resources, especially water—the protection of archaeological heritage has become an urgent priority. Investigating the sites that emerge on the surface is a possibility, but its usefulness is limited by two key aspects. First, we need a realistic approach to the archaeological elements that have disappeared. In this regard, an ML-based model like the ones we propose can serve as a valuable tool. It is important to consider that not all archaeological sites have been preserved and that, with the methodology described in this article, we can approximate the modelling of the areas that have suffered the most transformation by using historical DEMs and multicriteria analysis. Second, the preservation of cultural heritage is a social imperative [92]. In dynamic contexts, such as those involving changes in water resource management, new tools like the one we propose are required. Spain lacks inventories of submerged heritage in reservoirs. This case study reveals both the quantity and preservation state of sites that have remained largely unassessed since the 1960s. Public administrations should take responsibility for implementing these policies and, consequently, for applying similar GIS-based and remote sensing approaches, as advocated by the Valletta Convention [10,93].

Finally, we would like to emphasise that remote sensing data science can become an undeniable ally in decision-making processes amid rapid environmental changes, such as those currently being experienced in water management. However, we stress that the direct, on-the-ground observation of the impacts on archaeological sites remains essential. A comprehensive understanding of site typology—including whether sites consist of dispersed lithic or ceramic materials or more structured settlements—and their chronological context is crucial. Additionally, the variability in construction techniques, usage patterns, and evidence of reoccupation underscores the need for the local validation of remote sensing data to ensure accurate, context-specific assessments.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/rs17071306/s1, Data S1: List of archaeological sites.

Author Contributions

Conceptualization, E.C.-C. and P.B.-R.; methodology, E.C.-C.; validation, E.C.-C.; formal analysis, E.C.-C.; investigation, E.C.-C.; data curation, E.C.-C.; writing—original draft preparation, E.C.-C.; writing—review and editing, E.C.-C. and P.B.-R.; supervision, E.C.-C. and P.B.-R. All authors have read and agreed to the published version of the manuscript.

Funding

This work was commissioned by the General Directorate of Fine Arts and Cultural Heritage of the Ministry of Culture and Sports, through the General Subdirectorate of the Institute of Cultural Heritage of Spain and the General Subdirectorate for the Management and Coordination of Cultural Assets.

Data Availability Statement

The LiDAR data were obtained from the website of the Spanish National Centre for Geographic Information, under the INSPIRE European directive: https://centrodedescargas.cnig.es/CentroDescargas/home (accessed on 13 February 2025). The historical aerial photographs were acquired under a restricted-use licence from the Spanish Air Force. Other institutions have provided data under restricted access. Historical data on the reservoirs were supplied by the Tagus River Basin Authority. Information on the location of archaeological sites was provided by the Junta de Extremadura (regional government) and supplemented by our own research; these data must remain inaccessible for security reasons. The remaining raw data and code are available upon reasonable request.

Acknowledgments

We appreciate the commissioning of this work and the support of the General Directorate of Fine Arts and Cultural Heritage of the Ministry of Culture and Sports, through the General Subdirectorate of the Institute of Cultural Heritage of Spain. The coordinates of archaeological sites were obtained thanks to the Directorate for Cultural Heritage of Extremadura’s regional government. A great amount of data on archaeological sites was provided by Antonio González Cordero. Adara López helped in the curation of the geodatabases from both sources and our surveys.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:

ASPRS	American Society for Photogrammetry and Remote Sensing
GIS	Geographical Information System
IDW	Inverse Distance Weighted
IQR	Interquartile Range
ICP	Iterative Closest Point
LiDAR	Light Detection And Ranging
MAE	Mean Absolute Error
ML	Machine Learning
RMSE	Root Mean Squared Error
SCA	Specific Contributing Area
TRI	Terrain Ruggedness Index
TWI	Topographic Wetness Index
SHAP	SHapley Additive exPlanations
SPI	Stream Power Index
STI	Sediment Transport Index
USAF	United States Air Force
WLC	Weighted Linear Combination
XGBoost	eXtreme Gradient Boosting

References

Abbass, K.; Qasim, M.Z.; Song, H.; Murshed, M.; Mahmood, H.; Younis, I. A Review of the Global Climate Change Impacts, Adaptation, and Sustainable Mitigation Measures. Environ. Sci. Pollut. Res. 2022, 29, 42539–42559.
Masson-Delmotte, V.; Zhai, P.; Pörtner, H.-O.; Roberts, D.; Skea, J.; Shukla, P.; Pirani, A.; Moufouma-Okia, W.; Péan, C.; Pidcock, S.; et al. Global Warming of 1.5°C. An IPCC Report on the Impacts of Global Warming of 1.5°C above Pre-Industrial Levels and Related Global Greenhouse Emission Pathways, in the Context of Strengthening the Global Response to the Threat of Climate Change, Sustainable Development, and Efforts to Eradicate Poverty; IPCC: Geneva, Switzerland, 2018.
Maslin, M. Climate Change: A Very Short Introduction; Oxford Academic: Oxford, UK, 2021.
Dessler, A.E. Introduction to Modern Climate Change; Cambridge University Press: Cambridge, UK, 2021.
Orr, S.A.; Richards, J.; Fatorić, S. Climate Change and Cultural Heritage: A Systematic Literature Review (2016–2020). Hist. Environ. Policy Pract. 2021, 12, 434–477.
Wright, J.P.; Hylton, M. Exploring Climate Change Adaptations for Cultural Heritage: The ADAPT Framework. Adv. Archaeol. Pract. 2024, 12, 313–321.
Kotova, L.; Leissner, J.; Winkler, M.; Kilian, R.; Bichlmair, S.; Antretter, F.; Moßgraber, J.; Reuter, J.; Hellmund, T.; Matheja, K.; et al. Making Use of Climate Information for Sustainable Preservation of Cultural Heritage: Applications to the KERES Project. Herit. Sci. 2023, 11, 18.
Simpson, N.P.; Orr, S.A.; Sabour, S.; Clarke, J.; Ishizawa, M.; Feener, R.M.; Ballard, C.; Mascarenhas, P.V.; Pinho, P.; Bosson, J.-B.; et al. Impacts, Vulnerability, and Understanding Risks of Climate Change for Culture and Heritage: Contribution of Impacts Group II to the International Co-Sponsored Meeting on Culture, Heritage and Climate Change; ICOMOS International: Paris, France, 2022.
Fluck, H.; Guest, K. Climate Change and Archaeology. An Introduction. Internet Archaeology; Usborne Publishing: London, UK, 2022.
De Angeli, S.; Battistin, F. Archaeological Site Monitoring and Risk Assessment Using Remote Sensing Technologies and GIS. In A Research Agenda for Heritage Planning; Edward Elgar Publishing: Cheltenham, UK, 2021.
Mencía, J.; Polo, M.-E.; Felicísimo, Á.M.; Daire, M.-Y.; Güimil-Fariña, A.; Mañana-Borrazás, P.; Vilaseco Vázquez, X.I.; López-Romero, E. Comprehensive Workflow for Erosion Monitoring of Coastal Archaeological Sites by Means of Digital Photogrammetry: An Iberian Case Study. Digit. Appl. Archaeol. Cult. Herit. 2024, 35, e00373.
Fitzpatrick, S.M. On the Shoals of Giants: Natural Catastrophes and the Overall Destruction of the Caribbean’s Archaeological Record. J. Coast Conserv. 2012, 16, 173–186.
Fernández, A.F.; Abad, P.V.; Nóvoa, A.A.R. 3D Photogrammetry as a Tool for Studying Erosive Processes at a Roman Coastal Site: The Case of the Roman Fish-Salting Plant at Sobreira (Vigo, Spain). Archaeol. Anthropol. Sci. 2022, 14, 32.
Dawson, T.; Hambly, J.; Kelley, A.; Lees, W.; Miller, S. Coastal Heritage, Global Climate Change, Public Engagement, and Citizen Science. Proc. Natl. Acad. Sci. USA 2020, 117, 8280–8286.
Daire, M.-Y.; Lopez-Romero, E.; Proust, J.-N.; Regnauld, H.; Pian, S.; Shi, B. Coastal Changes and Cultural Heritage (1): Assessment of the Vulnerability of the Coastal Heritage in Western France. J. Isl. Coast. Archaeol. 2012, 7, 168–182.
Erlandson, J.M. Racing a Rising Tide: Global Warming, Rising Seas, and the Erosion of Human History. J. Isl. Coast. Archaeol. 2008, 3, 167–169.
Senf, C.; Seidl, R. Persistent Impacts of the 2018 Drought on Forest Disturbance Regimes in Europe. Biogeosciences 2021, 18, 5223–5230.
Ødegård, R.S.; Nesje, A.; Isaksen, K.; Andreassen, L.M.; Eiken, T.; Schwikowski, M.; Uglietti, C. Climate Change Threatens Archeologically Significant Ice Patches: Insights into Their Age, Internal Structure, Mass Balance and Climate Sensitivity. Cryosphere 2017, 11, 17–32.
Hollesen, J.; Callanan, M.; Dawson, T.; Fenger-Nielsen, R.; Friesen, T.M.; Jensen, A.M.; Markham, A.; Martens, V.V.; Pitulko, V.V.; Rockman, M. Climate Change and the Deteriorating Archaeological and Environmental Archives of the Arctic. Antiquity 2018, 92, 573–586.
Hollesen, J.; Matthiesen, H.; Elberling, B. The Impact of Climate Change on an Archaeological Site in the Arctic. Archaeometry 2017, 59, 1175–1189.
Matthiesen, H.; Jensen, J.B.; Gregory, D.; Hollesen, J.; Elberling, B. Degradation of Archaeological Wood Under Freezing and Thawing Conditions—Effects of Permafrost and Climate Change. Archaeometry 2014, 56, 479–495.
Desjardins, S.P.A.; Jordan, P.D. Arctic Archaeology and Climate Change. Annu. Rev. Anthropol. 2019, 48, 279–296.
Naumann, G.; Cammalleri, C.; Mentaschi, L.; Feyen, L. Increased Economic Drought Impacts in Europe with Anthropogenic Warming. Nat. Clim. Change 2021, 11, 485–491.
Cheval, S.; Dumitrescu, A.; Birsan, M.-V. Variability of the Aridity in the South-Eastern Europe over 1961–2050. Catena 2017, 151, 74–86.
Carvalho, D.; Pereira, S.C.; Silva, R.; Rocha, A. Aridity and Desertification in the Mediterranean under EURO-CORDEX Future Climate Change Scenarios. Clim. Change 2022, 174, 28.
Markonis, Y.; Kumar, R.; Hanel, M.; Rakovec, O.; Máca, P.; AghaKouchak, A. The Rise of Compound Warm-Season Droughts in Europe. Sci. Adv. 2021, 7, eabb9668.
Viviroli, D.; Archer, D.R.; Buytaert, W.; Fowler, H.J.; Greenwood, G.B.; Hamlet, A.F.; Huang, Y.; Koboltschnig, G.; Litaor, M.I.; López-Moreno, J.I.; et al. Climate Change and Mountain Water Resources: Overview and Recommendations for Research, Management and Policy. Hydrol. Earth Syst. Sci. 2011, 15, 471–504.
García-Ruiz, J.M.; López-Moreno, J.I.; Vicente-Serrano, S.M.; Lasanta-Martínez, T.; Beguería, S. Mediterranean Water Resources in a Global Change Scenario. Earth Sci. Rev. 2011, 105, 121–139.
Williamson, C.E.; Saros, J.E.; Vincent, W.F.; Smol, J.P. Lakes and Reservoirs as Sentinels, Integrators, and Regulators of Climate Change. Limnol. Oceanogr. 2009, 54, 2273–2282.
Shahady, T. Sustainable Water Management with a Focus on Climate Change. In Water and Climate Change; Elsevier: Amsterdam, The Netherlands, 2022; pp. 293–316.
Raje, D.; Mujumdar, P.P. Reservoir Performance under Uncertainty in Hydrologic Impacts of Climate Change. Adv. Water Resour. 2010, 33, 312–326.
Pahl-Wostl, C. Transitions towards Adaptive Management of Water Facing Climate and Global Change. Water Resour. Manag. 2006, 21, 49–62.
Ehsani, N.; Vörösmarty, C.J.; Fekete, B.M.; Stakhiv, E.Z. Reservoir Operations under Climate Change: Storage Capacity Options to Mitigate Risk. J. Hydrol. 2017, 555, 435–446.
Lettenmaier, D.P.; Wood, A.W.; Palmer, R.N.; Wood, E.F.; Stakhiv, E.Z. Water Resources Implications of Global Warming: A U.S. Regional Perspective. Clim. Change 1999, 43, 537–579.
Stakhiv, E.Z. Pragmatic Approaches for Water Management Under Climate Change Uncertainty. JAWRA J. Am. Water Resour. Assoc. 2011, 47, 1183–1196.
Olcina Cantos, J. Water Planning and Management in Spain in a Climate Change Context: Facts and Proposals. Cuad. Investig. Geográfica 2024, 50, 3–28.
Ware, J.A. Archeological Inundation Studies: Manual for Reservoir Managers; Defense Technical Information Center: Fort Belvoir, VA, USA, 1989.
Matamoros Coder, P.; Cerrillo-Cuenca, E. MARPASE: Un Modelo Geoespacial Para Calibrar El Daño Del Patrimonio Arqueológico Subacuático En Embalses. In Proceedings of the Actas del V Congreso Internacional de Arqueología Subacuática (IKUWA V), Subdirección General de Documentación y Publicaciones, Ministerio de Educación Cultura y Deporte, Madrid, Spain, 15–18 October 2014; pp. 460–462.
Matamoros Coder, P.; Carrascosa Moliner, B.; Cerrillo Cuenca, E. La Situación Del Patrimonio Arqueológico Subacuático En La Cuenca Extremeña Del Tajo. Perspectivas de Conservación, Documentación y Análisis. In Proceedings of the Arqueología Subacuática Española: Actas del I Congreso de Arqueología Náutica y Subacuática Española, Editorial UCA, Cartagena, Spain, 14–16 March 2013; pp. 67–80.
UNESCO. Climate Change and World Heritage Report on Predicting and Managing the Impacts of Climate Change on World Heritage and Strategy to Assist States Parties to Implement Appropriate Management Responses; Unesco World Heritage Centre: Paris, France, 2007. [Google Scholar]
Titolo, A. Use of Time-Series NDWI to Monitor Emerging Archaeological Sites: Case Studies from Iraqi Artificial Reservoirs. Remote Sens. 2021, 13, 786. [Google Scholar] [CrossRef]
Marchetti, N.; Curci, A.; Gatto, M.C.; Nicolini, S.; Mühl, S.; Zaina, F. A Multi-Scalar Approach for Assessing the Impact of Dams on the Cultural Heritage in the Middle East and North Africa. J. Cult. Herit. 2019, 37, 17–28. [Google Scholar] [CrossRef]
Marchetti, N.; Bitelli, G.; Franci, F.; Zaina, F. Archaeology and Dams in Southeastern Turkey. J. Mediterr. Archaeol. 2020, 33, 29–54. [Google Scholar] [CrossRef]
Cerrillo-Cuenca, E.; de Sanjosé Blasco, J.J.; Belinchón, R.C.; Bueno-Ramírez, P.; Cordero, A.G.; Pérez-Álvarez, J.A. Surveying and Monitoring Submerged Archaeological Sites in Inland Waters through a Multiproxy Strategy: The Case of Dolmen de Guadalperal and Other Sites from Valdecañas Reservoir (Spain). Archaeol. Prospect. 2024, 31, 53–69. [Google Scholar] [CrossRef]
Zarfl, C.; Lucía, A. The Connectivity between Soil Erosion and Sediment Entrapment in Reservoirs. Curr. Opin. Environ. Sci. Health 2018, 5, 53–59. [Google Scholar] [CrossRef]
Dutta, S. Soil Erosion, Sediment Yield and Sedimentation of Reservoir: A Review. Model Earth Syst. Environ. 2016, 2, 123. [Google Scholar] [CrossRef]
Waechter, S.A.; Mikesell, S.D. Research Desing for Prehistoric, Etnographic, and Historic Cultural Resources al Folsom Reservoir California; Sacramento, CA, USA, 1994; Available online: https://core.tdar.org/document/53967/research-design-for-prehistoric-ethnographic-and-historic-cultural-resources-at-folsom-reservoir-california (accessed on 11 January 2025).
The Guardian. Historic Monuments Resurface as Severe Drought Shrinks Spain’s Reservoirs; The Guardian: Manchester, UK, 2022. [Google Scholar]
Lidz, F. With Drought, “Spanish Stonehenge” Emerges Once Again New York Times: New York, NY, USA, 2022.
Science. Sicence News Staff News at a Glance: New Gene Therapy, Europe’s Drought, and a Black Hole’s Photon Ring; Science: Washington, DC, USA, 2022. [Google Scholar]
Chow, D. Drought Exposes Long-Submerged “Spanish Stonehenge” Monument; NBC News: New York, NY, USA, 2019. [Google Scholar]
Maishman, E. In Pictures: Drought in Europe Exposes Sunken Ships, Lost Villages and Ominous “Hunger Stones”. BBC News, 21 August 2022. [Google Scholar]
Leisner, G.; Leisner, V. El Guadalperal. Madr. Mitteilungen 1960, 1, 20–73. [Google Scholar]
Cerrillo-Cuenca, E.; de Sanjosé Blasco, J.J.; Bueno-Ramírez, P.; Pérez-Álvarez, J.A.; de Balbín Behrmann, R.; Sánchez-Fernández, M. Emergent Heritage: The Digital Conservation of Archaeological Sites in Reservoirs and the Case of the Dolmen de Guadalperal (Spain). Herit. Sci. 2021, 9, 114. [Google Scholar] [CrossRef]
Bueno Ramírez, P.; Balbín Behrmann, R. de Grafías y Territorios Megalíticos En Extremadura. In Muita Gente, Poucas Antas. Origens, Espaços e Contextos do Megalitismo. Actas do II Colóquio Internacional Sobre Megalitismo, Reguengos de Monsaraz, 3 a 7 de Maio de 2000; Gonçalves, V.d.S., Ed.; Ministério da Cultura, Instituto Portugués de Arqueología: Lisboa, Portugal, 2003; pp. 407–448. [Google Scholar]
Ministerio para la Transición Ecológica y el Reto Demográfico. Ficha Técnica de La Presa: Valdecañas. Sistema Nacional de Cartografía de Zonas Inundables. Available online: https://sig.mapama.gob.es/WebServices/clientews/snczi/default.aspx?origen=8&nombre=EGISPE_PRESA&claves=ID_INFRAESTRUCTURA&valores=541 (accessed on 11 January 2025).
González Cordero, A.; Morán Sánchez, C.J. Talavera La Vieja y Su Entorno Arqueológico. In El conjunto Orientalizante de Talavera la Vieja (Cáceres); Jiménez Ávila, F.J., Ed.; Junta de Extremadura, Consejería de Cultura: Cáceres, Spain, 2006; pp. 19–44. [Google Scholar]
Parcero-Oubiña, C. At Last! Discovery of Archaeological Features through Aerial Imagery and Lidar at Galician Hillforts. AARGnews 2022, 64, 23–33. [Google Scholar]
Mitasova, H.; Barton, M.; Ullah, I.; Hofierka, J.; Harmon, R.S. 3.9 GIS-Based Soil Erosion Modeling. In Treatise on Geomorphology; Elsevier: Amsterdam, The Netherlands, 2013; pp. 228–258. [Google Scholar]
Cerrillo-Cuenca, E.; Sanjosé, J. Mapping and Interpreting Vanished Archaeological Features Using Historical Aerial Photogrammes and Digital Photogrammetry. In Proceedings of the 38 th Annual Conference on Computer Applications and Quantitative Methods in Archaeology, CAA2010, Granada, Spain, 6–9 April 2010; pp. 43–46. [Google Scholar]
Pérez, J.A.; Bascon, F.M.; Charro, M.C. Photogrammetric Usage of 1956-57 Usaf Aerial Photography of Spain. Photogramm. Rec. 2014, 29, 108–124. [Google Scholar] [CrossRef]
Zhou, Q.-Y.; Park, J.; Koltun, V. Open3D: A Modern Library for 3D Data Processing. arXiv, 2018; arXiv:1801.09847. [Google Scholar]
Lindsay, J.B. Whitebox GAT: A Case Study in Geomorphometric Analysis. Comput. Geosci. 2016, 95, 75–84. [Google Scholar] [CrossRef]
Persad, R.A.; Armenakis, C. Automatic Co-Registration of 3D Multi-Sensor Point Clouds. ISPRS J. Photogramm. Remote Sens. 2017, 130, 162–186. [Google Scholar] [CrossRef]
Koide, K.; Yokozuka, M.; Oishi, S.; Banno, A. Voxelized GICP for Fast and Accurate 3D Point Cloud Registration. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May 2021; pp. 11054–11059. [Google Scholar]
Li, J.; Zhan, J.; Zhou, T.; Bento, V.A.; Wang, Q. Point Cloud Registration and Localization Based on Voxel Plane Features. ISPRS J. Photogramm. Remote Sens. 2022, 188, 363–379. [Google Scholar] [CrossRef]
Xiong, B.; Jiang, W.; Li, D.; Qi, M. Voxel Grid-Based Fast Registration of Terrestrial Point Cloud. Remote Sens. 2021, 13, 1905. [Google Scholar] [CrossRef]
Chen, Y.; Medioni, G. Object Modelling by Registration of Multiple Range Images. Image Vis. Comput. 1992, 10, 145–155. [Google Scholar] [CrossRef]
Besl, P.J.; McKay, N.D. A Method for Registration of 3-D Shapes. IEEE Trans. Pattern Anal. Mach. Intell. 1992, 14, 239–256. [Google Scholar] [CrossRef]
Cheng, L.; Chen, S.; Liu, X.; Xu, H.; Wu, Y.; Li, M.; Chen, Y. Registration of Laser Scanning Point Clouds: A Review. Sensors 2018, 18, 1641. [Google Scholar] [CrossRef]
de Oliveira, E.M.; dos Santos, D.R. Closed-Form Solution to Point- and Plane-Based Co-Registration of Terrestrial LiDAR Point Clouds. Appl. Geomat. 2023, 15, 421–439. [Google Scholar] [CrossRef]
Armenakis, C.; Gao, Y.; Sohn, G. Co-Registration of Aerial Photogrammetric and LiDAR Point Clouds in Urban Environments Using Automatic Plane Correspondence. Appl. Geomat. 2013, 5, 155–166. [Google Scholar] [CrossRef]
Li, J.; Heap, A.D. Spatial Interpolation Methods Applied in the Environmental Sciences: A Review. Environ. Model. Softw. 2014, 53, 173–189. [Google Scholar] [CrossRef]
Horn, B.K.P. Hill Shading and the Reflectance Map. Proc. IEEE 1981, 69, 14–47. [Google Scholar] [CrossRef]
Florinsky, I.V. Digital Terrain Analysis in Soil Science and Geology; Academic Press: Cambridge, MA, USA, 2012. [Google Scholar]
Gallant, J.C.; Wilson, J.P. Primary Topographic Attributes. In Terrain Analysis: Principles and Application; Wilson, J.P., Gallant, J.C., Eds.; Wiley: Hoboken, NJ, USA, 2000; pp. 51–86. [Google Scholar]
Shary, P.A. Land Surface in Gravity Points Classification by a Complete System of Curvatures. Math. Geol. 1995, 27, 373–390. [Google Scholar] [CrossRef]
Riley, S.J.; DeGloria, S.D.; Elliot, R. A Terrain Ruggedness Index That Quantifies Topographic Heterogeneity. Intermt. J. Sci. 1999, 5, 23–27. [Google Scholar]
Beven, K.J.; Kirby, M.J. A Physically Based, Variable Contributing Area Model of Basin Hydrology. Hydrol. Sci. Bull. 1979, 24, 43–69. [Google Scholar] [CrossRef]
Moore, I.D.; Grayson, R.B.; Ladson, A.R. Digital Terrain Modelling: A Review of Hydrological, Geomorphological, and Biological Applications. Hydrol. Process. 1991, 5, 3–30. [Google Scholar] [CrossRef]
O’Callaghan, J.F.; Mark, D.M. The Extraction of Drainage Networks from Digital Elevation Data. Comput. Vis. Graph Image Process. 1984, 28, 323–344. [Google Scholar] [CrossRef]
Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 13 August 2016; pp. 785–794. [Google Scholar]
Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-Learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
Lundberg, S.M.; Lee, S.I. A Unified Approach to Interpreting Model Predictions. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Volume 2017. [Google Scholar]
Dunn, O.J. Multiple Comparisons Among Means. J. Am. Stat. Assoc. 1961, 56, 52. [Google Scholar] [CrossRef]
Thompson, J.A.; Bell, J.C.; Butler, C.A. Digital Elevation Model Resolution: Effects on Terrain Attribute Calculation and Quantitative Soil-Landscape Modeling. Geoderma 2001, 100, 67–89. [Google Scholar] [CrossRef]
Malczewski, J. On the Use of Weighted Linear Combination Method in GIS: Common and Best Practice Approaches. Trans. GIS 2000, 4, 5–22. [Google Scholar] [CrossRef]
García Bellido, A. Excavaciones En Augustrobriga (Talavera La Vieja, Cáceres). Not. Arqueol. Hisp. 1962, 5, 235–237. [Google Scholar]
Cenci, L.; Disperati, L.; Sousa, L.P.; Phillips, M.; Alve, F.L. Geomatics for Integrated Coastal Zone Management: Multitemporal Shoreline Analysis and Future Regional Perspective for the Portuguese Central Region. J. Coast Res. 2013, 165, 1349–1354. [Google Scholar] [CrossRef]
Mitasova, H. Quantifying Rapid Changes in Coastal Topography Using Modern Mapping Techniques and Geographic Information System. Environ. Eng. Geosci. 2004, 10, 1–11. [Google Scholar] [CrossRef]
Chicco, D.; Warrens, M.J.; Jurman, G. The Coefficient of Determination R-Squared Is More Informative than SMAPE, MAE, MAPE, MSE and RMSE in Regression Analysis Evaluation. PeerJ Comput. Sci. 2021, 7, e623. [Google Scholar] [CrossRef] [PubMed]
Kajda, K.; Marx, A.; Wright, H.; Richards, J.; Marciniak, A.; Rossenbach, K.S.; Pawleta, M.; van den Dries, M.H.; Boom, K.; Guermandi, M.P.; et al. Archaeology, Heritage, and Social Value: Public Perspectives on European Archaeology. Eur. J. Archaeol. 2018, 21, 96–117. [Google Scholar] [CrossRef]
Battistin, F.; De Angeli, S. Da CLIMA a RESEARCH. Monitoraggio e Valutazione Del Rischio Nei Siti Archeologici Mediante l’applicazione Di Tecnologie Di Remote Sensing e GIS. In Monitoraggio e Manutenzione delle Aree Archeologiche: CAMBIAMENTI Climatici, Dissesto Idrogeologico, Degrado Chimico-Ambientale: Atti del Convegno Internazionale di Studi, Roma, Curia Iulia, 20–21 Marzo 2019; Russo, A., Della Giovampaola, I., Eds.; L’Erma di Bretschneider: Roma, Italy, 2020; pp. 275–279. [Google Scholar]

Figure 1. View of Dolmen de Guadalperal, a Neolithic collective burial structure built around 4000 BC. The archaeological site is situated at an elevation of 296 m a.s.l., 19 m below the reservoir’s maximum water level, and is accessible only during drought periods.

Figure 2. Location of the Valdecañas Reservoir in the Iberian Peninsula and the distribution of recorded archaeological sites in the flooded area, indicating those mentioned in the text. Dots represent scatters of artefacts, monuments, and cities, described in Supplementary Material Data S1.

Figure 3. Workflow of the proposed methodology.

Figure 4. Seasonal variation in Valdecañas Reservoir elevations. The figure shows the 7-day moving average of reservoir elevations for different decades (1970s, 1980s, 1990s, 2000s, 2010s, and 2020s) over the course of a year.

Figure 5. Seasonal variation in Valdecañas Reservoir elevations. The figure shows the 7-day moving average of reservoir elevations for five-year periods (1995–2019) over the course of a year.
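
The smoothing named in the captions of Figures 4 and 5 can be illustrated with a minimal sketch. It assumes a trailing 7-day window over a list of daily reservoir elevations; the source does not say whether the window is trailing or centred, and the function name and the sample values are hypothetical.

```python
def moving_average(values, window=7):
    """Trailing moving average.

    The first window-1 outputs average only the samples seen so far
    (an assumption of this sketch, not a detail given in the article).
    """
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Hypothetical daily elevations in m a.s.l., for illustration only.
daily_levels = [310.0, 309.8, 309.5, 309.6, 309.1, 308.9, 308.7, 308.2]
smoothed = moving_average(daily_levels)
print([round(v, 2) for v in smoothed])
```

Grouping the smoothed series by day of year and by decade (or five-year period) would then give curves comparable to those plotted in the figures.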

Figure 6. DEM of the Valdecañas Reservoir (Spain) generated from the photogrammetric restitution (structure from motion) of aerial photographs taken by the USAF in 1957.

Figure 7. Scheme of the point cloud co-registration workflow. The 2010 and 2023 point clouds were co-registered with the point-to-point ICP algorithm, using the 2018 point cloud as the reference.

Figure 10. An example of improved visualisation of multitemporal changes in DEMs. (1–3) Hillshades of the 2010, 2018, and 2023 DEMs. (4) RGB image composed from the differences (Δ) between the aforementioned DEMs. This procedure highlights increases in DEM values, which helps identify major changes and allows a qualitative, intuitive assessment of topographic change over time. Elevations below 301 m a.s.l. (covered by the reservoir) are shown in blue.

Figure 11. Distribution of the archaeological sites considered in the present study in the Valdecañas Reservoir (Spain).

Figure 12. Percentage of raster cells in each archaeological site, categorised as follows: (1) cells suitable for analysis (301–315 m a.s.l.), (2) permanently emerged cells (>315 m a.s.l.), and (3) cells not recorded in LiDAR point clouds (<301 m a.s.l.).
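
The three elevation classes in Figure 12 amount to a simple threshold rule. The sketch below shows one way the percentages per site could be computed, assuming an elevation value is available for every cell of a site (for example from a DEM that also covers the submerged part). The function name, the class labels, and the sample elevations are hypothetical.

```python
def classify_cells(elevations, low=301.0, high=315.0):
    """Return the percentage of cells in each elevation class.

    low and high are the 301 m and 315 m a.s.l. bounds from the caption.
    """
    counts = {"analysable": 0, "emerged": 0, "unrecorded": 0}
    for z in elevations:
        if z < low:
            counts["unrecorded"] += 1   # below the LiDAR-recorded range
        elif z > high:
            counts["emerged"] += 1      # permanently above the water
        else:
            counts["analysable"] += 1   # within the fluctuation band
    n = len(elevations)
    if n == 0:
        return {name: 0.0 for name in counts}
    return {name: 100.0 * c / n for name, c in counts.items()}


# Hypothetical cell elevations for one site, in m a.s.l.
site_cells = [299.5, 302.0, 308.4, 314.9, 316.2, 300.1, 305.0, 311.3]
print(classify_cells(site_cells))
```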

Figure 13. Results from XGBoost Model 1B. Feature importance analysis based on “gain”, “weight”, and “coverage”. “Gain” represents the relative improvement in accuracy when a feature is used for splitting. “Weight” indicates the number of times a feature is used in the decision trees. “Coverage” measures the number of instances a feature contributes to when making predictions.

Figure 14. Results from XGBoost Model 2B. Feature importance analysis based on “gain”, “weight”, and “coverage”. “Gain” represents the relative improvement in accuracy when a feature is used for splitting. “Weight” indicates the number of times a feature is used in the decision trees. “Coverage” measures the number of instances a feature contributes to when making predictions.

Figure 15. SHAP summary plot of Model 2B showing the impact of each feature on the model’s output. The x-axis represents the SHAP value, indicating the contribution of each feature to the prediction. Colours denote feature values, with high values in red and low values in blue. Features are ranked by importance from top to bottom.

Figure 16. Stacked bar chart representing the categorised information of Model 2B predictions on topographical variations in archaeological sites.

Figure 17. Differences in percentages of predicted categories (Model 2B) in different archaeological sites. (a) Site affected by moderate erosion and minimal sedimentation. (b) Site solely affected by recurrent sedimentary processes of diverse magnitudes. (c,d) Sites affected by similar processes of moderate erosion and sedimentation. For most of the cases (except for (b)), the highest percentages of predictions fall in the “neutral” category of [−0.3, 0.3].

Figure 18. Comparison of elevation values between the DEM derived from photogrammetry (1957) and the DEM obtained from LiDAR data (2018). The left panel shows the histogram of elevation values for both datasets. The middle panel presents a scatter plot of the paired elevation values, with the dashed line representing the 1:1 relationship. The right panel displays the histogram of errors (1957–2018), with the red dashed line indicating the bias (−2.271 m), demonstrating that the 1957 DEM systematically underestimates elevation compared to the 2018 DEM.
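
The bias shown in the right panel of Figure 18 is the mean of the cell-wise errors between the two DEMs. As a minimal sketch, assuming paired elevation samples taken at the same locations, the bias, the RMSE, and the standard deviation of the error can be computed as follows. The function name and the sample values are hypothetical and do not reproduce the article's data.

```python
import math


def error_stats(z_a, z_b):
    """Bias (mean error), RMSE and error standard deviation of z_a - z_b."""
    errors = [a - b for a, b in zip(z_a, z_b)]
    n = len(errors)
    bias = sum(errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    sd = math.sqrt(sum((e - bias) ** 2 for e in errors) / n)
    return bias, rmse, sd


# Hypothetical paired elevations (m a.s.l.) at the same three locations.
z_1957 = [298.0, 300.0, 305.0]
z_2018 = [300.0, 303.0, 306.0]
bias, rmse, sd = error_stats(z_1957, z_2018)
print(bias, rmse, sd)
```

A negative bias means the first DEM sits lower than the second on average. Because RMSE² equals bias² plus the error variance, a large bias with a small standard deviation points to a systematic vertical offset rather than random noise.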

Figure 19. Result of the WLC analysis of the selected variables. The raster values are grouped into five value ranges, representing different levels of topographic change. The background of the map corresponds to the 2023 DEM to provide topographic context.

Figure 20. Estimation of changes in the surroundings of Dolmen de Guadalperal (marked by a circle). (a) Orthoimage from 1957 (distributed by the Spanish National Institute for Geography) showing the study area before the reservoir’s construction. (b) Raster of predicted topographic changes based on the WLC analysis.

Figure 21. Estimation of changes in the surroundings of the Roman city of Augustóbriga, beneath the flooded town of Talavera La Vieja (encircled). (a) Orthoimage from 1957 (distributed by the Spanish National Institute for Geography) showing the study area before the reservoir’s construction. (b) Raster of predicted topographic changes based on the WLC analysis.

Table 1. Technical specifications of the LiDAR data used for terrain analysis. The data obtained from https://pnoa.ign.es/web/portal/pnoa-lidar/ (accessed on 13 February 2025) refer to general regional estimates.

LiDAR Coverage	Sensor	Point Density (Points/m²)	RMSE xy (m)	RMSE z (m)
1, 2010	Leica ALS50 (Leica Geosystems AG, Heerbrugg, Switzerland)	0.5	0.3	0.4
2, 2018	RIEGL LMS-Q1560 (RIEGL Laser Measurement Systems GmbH, Horn, Austria)	2	0.2	0.15
3, 2023	ATLM Galaxy T2000 Optech (Teledyne Optech, Vaughan, ON, Canada)	5	Not disclosed	Not disclosed

Table 3. Description of the response variable and data distribution intervals for the XGBoost models. Models are categorised as “noise” or “significant” based on whether the absolute differences between DEMs are below or above the RMSE threshold calculated for each interval. Models 1A and 1B correspond to the 2010–2018 period, while Models 2A and 2B correspond to the 2018–2023 period.

Model	Response Variable	Data Distribution Intervals Considered for the Response Variable
Model 1A (noise)	ΔZ_2010–2018 (x, y)	The absolute values of the differences between DEMs are less than the RMSE calculated for those differences: |ΔZ_2010–2018 (x, y)| < RMSE
Model 1B (significant)	ΔZ_2010–2018 (x, y)	The absolute values of the differences between DEMs are greater than the RMSE calculated for those differences: RMSE < |ΔZ_2010–2018 (x, y)|
Model 2A (noise)	ΔZ_2018–2023 (x, y)	The absolute values of the differences between DEMs are less than the RMSE calculated for those differences: |ΔZ_2018–2023 (x, y)| < RMSE
Model 2B (significant)	ΔZ_2018–2023 (x, y)	The absolute values of the differences between DEMs are greater than the RMSE calculated for those differences and below the upper outlier fence: RMSE < |ΔZ_2018–2023 (x, y)| < Q3 + IQR × 1.5
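
The split of the response variable into "noise" and "significant" subsets can be sketched as a threshold test on the absolute DEM difference. The example below uses the 0.3 m and 1 m bounds reported in Table 5 as default parameters. How values exactly on a boundary are assigned, the function name, and the sample values are assumptions of this sketch.

```python
def split_by_threshold(dz_values, rmse=0.3, upper=1.0):
    """Split elevation differences (m) by their absolute magnitude.

    noise:       |dz| <= rmse
    significant: rmse < |dz| <= upper
    excluded:    |dz| > upper (treated here as outliers)
    """
    noise, significant, excluded = [], [], []
    for dz in dz_values:
        magnitude = abs(dz)
        if magnitude <= rmse:
            noise.append(dz)
        elif magnitude <= upper:
            significant.append(dz)
        else:
            excluded.append(dz)
    return noise, significant, excluded


# Hypothetical elevation differences in metres.
noise, significant, excluded = split_by_threshold([0.1, -0.25, 0.5, -0.9, 1.4])
print(noise, significant, excluded)
```

Each subset would then be used to train its own model, so that changes within the measurement error are not mixed with changes that exceed it.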

Table 4. Summary of fitness and RMSE statistics for the co-registration of the 2010 and 2023 files using the 2018 files as the reference. Fitness values are normalised to the range [0, 1], while RMSE values and their standard deviations are expressed in metres.

Dataset	Mean of Fitness [0, 1]	Standard Deviation of Fitness	Mean of Inliers RMSE (m)	Standard Deviation of Inliers RMSE (m)
Co-registration of 2010 files (reference: 2018 files)	0.81	0.16	0.29	0.01
Co-registration of 2023 files (reference: 2018 files)	0.86	0.13	0.27	0.01
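
The two quality measures in Table 4 can be illustrated with a small sketch. It assumes the definitions used by common ICP implementations such as Open3D: fitness is the share of source points that have a reference point within a maximum correspondence distance, and inlier RMSE is the root mean square distance of those matched pairs. Whether the article's figures were computed in exactly this way is an assumption, and the brute-force nearest-neighbour search, the function name, and the sample points are for illustration only.

```python
import math


def icp_quality(source, target, max_dist):
    """Fitness and inlier RMSE of an already aligned source cloud.

    source, target: lists of (x, y, z) tuples; max_dist in the same units.
    """
    inlier_sq_dists = []
    for p in source:
        # Squared distance to the nearest target point (brute force).
        d2 = min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in target)
        if d2 <= max_dist ** 2:
            inlier_sq_dists.append(d2)
    fitness = len(inlier_sq_dists) / len(source)
    if inlier_sq_dists:
        inlier_rmse = math.sqrt(sum(inlier_sq_dists) / len(inlier_sq_dists))
    else:
        inlier_rmse = 0.0
    return fitness, inlier_rmse


# Hypothetical points: two source points match the reference, one does not.
source = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.1), (5.0, 5.0, 5.0)]
target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(icp_quality(source, target, max_dist=0.5))
```

Read this way, a mean fitness of 0.81 to 0.86 would mean that most points found a counterpart in the 2018 reference, with matched pairs lying roughly 0.3 m apart on average.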

Table 5. Performance metrics for the XGBoost models analysing DEM differences. The table includes the response variable, the number of samples (n), the coefficient of determination (R²), and the root mean square error (RMSE). Models 1A and 1B correspond to the 2010–2018 interval, while Models 2A and 2B correspond to the 2018–2023 interval, divided into “noise” and “significant” categories.

Model	Response Variable	n (Used for Training)	R²	RMSE (m)
Model 1A (noise)	|ΔZ_2010–2018 (x, y)| ≤ 0.3	343,027	0.094	0.066
Model 1B (significant)	|ΔZ_2010–2018 (x, y)| ∈ [0.3, 1]	115,365	0.437	0.377
Model 2A (noise)	|ΔZ_2018–2023 (x, y)| ≤ 0.3	275,904	0.241	0.064
Model 2B (significant)	|ΔZ_2018–2023 (x, y)| ∈ [0.3, 1]	104,490	0.685	0.268
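
For readers less familiar with the two metrics in Table 5, the following sketch computes R² and RMSE from observed and predicted values using their standard definitions. The function name and the sample values are hypothetical and unrelated to the article's data.

```python
import math


def r2_and_rmse(y_true, y_pred):
    """Coefficient of determination and root mean square error."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n)


# Hypothetical observed and predicted elevation changes in metres.
observed = [0.4, 0.6, 0.8, 1.0]
predicted = [0.5, 0.6, 0.7, 1.0]
r2, rmse = r2_and_rmse(observed, predicted)
print(round(r2, 3), round(rmse, 3))
```

R² is relative to the variance of the response, which is why a model restricted to a narrow interval (such as the "noise" models) can show a small RMSE and still a low R².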

Table 6. Importance and ranking of variables in Models 1B and 2B. The table shows the relative importance of each variable in both models, along with their respective rankings.

Variable	Model 1B Gain	Model 1B Rank	Model 2B Gain	Model 2B Rank
Aspect_cos	0.95	1	1.36	1
Aspect_sin	0.68	2	0.24	7
Flood Frequency	0.66	3	0.87	2
TRI	0.39	4	0.21	8
Profile Curvature	0.36	5	0.19	9
Slope	0.35	6	0.29	3
STI	0.32	7	0.24	6
Planimetric Curvature	0.28	8	0.17	10
Shoreline Position Frequency	0.28	9	0.25	5
SPI	0.28	10	0.19	11
TWI	0.26	11	0.14	12
SCA	0.24	12	0.14	13

Table 7. Normalisation of the importance factor of the four main variables in Model 2B and the sign of their impact on the response variable |ΔZ_2018–2023 (x, y)| ∈ [0.3, 1]. The sign of the effect on the variable has been estimated through the SHAP analysis in Figure 15.

Variable	Importance in Model 2B	Normalised Weight
Cosine of aspect	1.36 (negative effect)	−0.49
Flood Frequency	0.87 (positive effect)	0.32
Slope	0.29 (negative effect)	−0.11
Shoreline Position Frequency	0.25 (positive effect)	0.09
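
The normalisation in Table 7 can be sketched as dividing each importance by the sum of the four importances and attaching the sign of the effect. The weighted linear combination (WLC) step shown afterwards assumes that every criterion has already been rescaled to a common range such as [0, 1]. The rescaling, the function names, and the sample cell values are assumptions of this sketch, not details taken from the article.

```python
def normalised_weights(importances, signs):
    """Signed weights whose absolute values sum to 1."""
    total = sum(importances.values())
    return {name: signs[name] * value / total
            for name, value in importances.items()}


def wlc(cell, weights):
    """Weighted linear combination of the criteria values of one cell."""
    return sum(weights[name] * cell[name] for name in weights)


# Importances and effect signs as listed in Table 7.
gain = {"aspect_cos": 1.36, "flood_frequency": 0.87,
        "slope": 0.29, "shoreline_frequency": 0.25}
sign = {"aspect_cos": -1, "flood_frequency": 1,
        "slope": -1, "shoreline_frequency": 1}
weights = normalised_weights(gain, sign)
print({name: round(w, 3) for name, w in weights.items()})

# Hypothetical criteria values for one raster cell, rescaled to [0, 1].
cell = {"aspect_cos": 0.2, "flood_frequency": 0.9,
        "slope": 0.1, "shoreline_frequency": 0.5}
print(round(wlc(cell, weights), 3))
```

Computed this way, the weights agree with the Normalised Weight column to within about 0.01. Applying `wlc` to every cell of a raster stack would yield a surface of the kind mapped in Figure 19.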

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
