#### *2.5. RTK-GNSS Data Processing*

Mounting a GNSS rover on a bicycle made topographic measurements efficient. However, unlike a rover set atop a pole held static and vertical during data acquisition, a bike-mounted antenna is subject to pitch and roll, caused by the local relief in the direction of travel and by the bike's own instability, which may introduce errors in the measured point coordinates. To minimize these effects, particularly pitch-related errors, which were easier to detect, survey points were filtered according to (i) the confidence level with which they were obtained, and (ii) systematic errors due to uneven terrain.

Where survey points were sufficiently close to each other, point confidence was estimated as the standard deviation of elevations within a circular window of 0.2 m radius, which was taken as a proxy for local bed complexity. Of the points evaluated, only those with a confidence score below 0.035 m (the theoretical vertical precision of RTK-GNSS) were retained, thereby filtering out points made less reliable by a locally complex bed morphology and/or unnaturally large deviations among surrounding points.
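As an illustration of this confidence filter, the following is a minimal sketch rather than the processing code actually used. The function names, the brute-force neighbour search, the minimum of three points per window, the use of the population standard deviation, and the choice to keep points that could not be evaluated are all assumptions made for the example.

```python
import math
import statistics

def local_confidence(points, radius=0.2, min_neighbors=3):
    """Standard deviation of elevations within `radius` of each (x, y, z) point.

    Returns None where fewer than `min_neighbors` points (the point itself
    included) fall inside the window. `min_neighbors` is a hypothetical
    stand-in for "sufficiently close to each other".
    """
    scores = []
    for xi, yi, _ in points:
        window = [z for x, y, z in points
                  if math.hypot(x - xi, y - yi) <= radius]
        if len(window) >= min_neighbors:
            # Population standard deviation: an assumption, not stated in the text.
            scores.append(statistics.pstdev(window))
        else:
            scores.append(None)
    return scores

def filter_by_confidence(points, threshold=0.035, **kwargs):
    """Keep evaluated points scoring below `threshold` (m).

    Points that could not be evaluated are kept (assumption).
    """
    scores = local_confidence(points, **kwargs)
    return [p for p, s in zip(points, scores) if s is None or s < threshold]

# Hypothetical survey: a smooth cluster, a rough cluster and an isolated point.
survey = [
    (0.0, 0.0, 1.00), (0.1, 0.0, 1.01), (0.0, 0.1, 0.99),
    (5.0, 5.0, 2.00), (5.1, 5.0, 2.20), (5.0, 5.1, 1.80),
    (10.0, 10.0, 3.00),
]
scores = local_confidence(survey)
kept = filter_by_confidence(survey)
```

With these made-up coordinates, the rough cluster is discarded while the smooth cluster and the isolated point are kept. For dense surveys, a spatial index would replace the quadratic neighbour search.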

Pitch-related errors due to uneven terrain were estimated assuming that the bicycle followed a straight line between two consecutive survey points, that the GNSS antenna was vertically aligned with the rear wheel center point when the bike was horizontal, and that the system was well balanced at all times (i.e., no roll). Under these conditions, the bed slope in the travel direction, the pitch angle of the bicycle and that of the GNSS antenna are the same. We calculated the horizontal (dx) and vertical (dz) point displacements due to a sloping bed using a distance between the two wheels of 1.2 m and a GNSS antenna height of 0.6 m. With the arrangement used, horizontal errors were predominant over vertical errors. As an indication, a 6° slope (equivalent to a beach gradient of 1 in 10) resulted in horizontal and vertical errors of approximately 0.062 m and 0.007 m, respectively. As with filtering by point confidence, points with pitch-related errors exceeding 0.035 m were systematically removed so as to retain only data in permissible terrain (here, terrain with forward slopes below approximately 3°).
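The displacement equations are not given in the text. The sketch below therefore assumes one simple geometry, dx = h sin θ cos θ and dz = h sin² θ for an antenna at height h pitched by θ, chosen only because it reproduces the values quoted for a 6° slope. The function names are hypothetical, and the 1.2 m wheelbase is not used in this simplified form.

```python
import math

ANTENNA_HEIGHT = 0.6   # m, antenna height given in the text
TOLERANCE = 0.035      # m, filtering threshold given in the text

def pitch_from_points(p0, p1):
    """Pitch angle (radians) of the straight segment joining two
    consecutive (x, y, z) survey points."""
    run = math.hypot(p1[0] - p0[0], p1[1] - p0[1])
    return math.atan2(p1[2] - p0[2], run)

def pitch_displacements(theta, h=ANTENNA_HEIGHT):
    """Horizontal and vertical displacement magnitudes (m) for pitch `theta`.

    Assumed geometry (not taken from the source):
        dx = h * sin(theta) * cos(theta)
        dz = h * sin(theta) ** 2
    """
    dx = abs(h * math.sin(theta) * math.cos(theta))
    dz = h * math.sin(theta) ** 2
    return dx, dz

def within_tolerance(theta, tol=TOLERANCE, h=ANTENNA_HEIGHT):
    """True when both displacement components stay within `tol`."""
    dx, dz = pitch_displacements(theta, h)
    return max(dx, dz) <= tol

dx6, dz6 = pitch_displacements(math.radians(6.0))
```

Under this assumed geometry, the horizontal component reaches the 0.035 m tolerance at a pitch slightly above 3°, which is consistent with the permissible slope quoted above.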

#### *2.6. Measurement Quality Evaluation*

Different strategies and error metrics (Table 3) were used to evaluate the measurement quality achieved with photogrammetry. As far as possible, they were used together to assess results in terms of accuracy, precision and reliability [66]. Accuracy and precision were estimated by comparison with reference data (ground truth) assumed to be of higher quality, whereas reliability represented the consistency between data obtained using different processing parameters, as determined through DEMs of difference (DoDs).

Accuracy, representing systematic deviations (bias) from the ground truth, and precision were calculated as the mean error (ME, Equation (1)) and the standard deviation of error (SDE, Equation (2)), respectively, between photogrammetric models and ground targets. RMSE (Equation (3)) is also provided, representing the overall error within results by combining ME and SDE into a single statistical measure (Equation (4)). Error statistics were calculated along all three dimensions (x, y and z) and were then added in quadrature to produce a measure of the 3D error (Equation (5)). For ease of use and clarity of the text, error metrics given without a specified direction refer to vertical error, which is of immediate interest for measuring morphological change with DEMs.
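The numbered equations are not reproduced in this text. Assuming the conventional definitions implied by the description, they can be written as follows, where E stands for any of the error statistics. The 1/N normalization in SDE is an assumption, chosen so that Equation (4) holds exactly.

$$
\begin{aligned}
\mathrm{ME} &= \overline{y - x} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - x_i\right) && (1)\\
\mathrm{SDE} &= \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - x_i - \mathrm{ME}\right)^{2}} && (2)\\
\mathrm{RMSE} &= \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - x_i\right)^{2}} && (3)\\
\mathrm{RMSE}^{2} &= \mathrm{ME}^{2} + \mathrm{SDE}^{2} && (4)\\
E_{3D} &= \sqrt{E_X^{2} + E_Y^{2} + E_Z^{2}} && (5)
\end{aligned}
$$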



In Equations (1)–(5), y is the measurement or observation, x the reference value and N the number of available comparisons. Horizontal and vertical bars represent the average (i.e., mean) and absolute (i.e., unsigned) values, respectively. X, Y and Z represent the easting, northing and vertical (i.e., elevation) directions, respectively.
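A short numerical illustration of these statistics is sketched below. The function names and sample values are hypothetical, and a 1/N normalization is assumed for SDE so that RMSE² = ME² + SDE² holds exactly.

```python
import math

def error_metrics(measured, reference):
    """ME, SDE and RMSE of measured values (y) against reference values (x).

    SDE uses a 1/N normalization (assumption), so RMSE**2 == ME**2 + SDE**2.
    """
    errors = [y - x for y, x in zip(measured, reference)]
    n = len(errors)
    me = sum(errors) / n
    sde = math.sqrt(sum((e - me) ** 2 for e in errors) / n)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return {"ME": me, "SDE": sde, "RMSE": rmse}

def quadrature(e_x, e_y, e_z):
    """Combine per-axis error statistics into a single 3D value."""
    return math.sqrt(e_x ** 2 + e_y ** 2 + e_z ** 2)

# Hypothetical elevations (m): four model values against surveyed targets.
model_z = [10.02, 9.98, 10.05, 10.01]
target_z = [10.00, 10.00, 10.00, 10.00]
vertical = error_metrics(model_z, target_z)
```

In this made-up case the bias (ME) is 0.015 m and the spread (SDE) is 0.025 m, and the RMSE combines both.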

At the La Palue field site only, DEM vertical accuracy and precision were also derived through comparisons with points surveyed with the bike-mounted GNSS. This allowed for independent error evaluation over a dense network of ground truth points. To assess photogrammetric results, each GNSS survey point was compared with the closest DEM grid point, subject to a maximum comparison distance of 0.2 m. This limit was chosen because comparisons with more distant points would reflect changes in the local bed topography more than the measurement error itself.
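The comparison rule can be sketched as follows, assuming a regular grid stored as nested lists with elevations at cell centres. The grid layout, the function name and the `None` no-data convention are assumptions for illustration. Only the single nearest cell centre is considered, which is a simplification.

```python
import math

def nearest_cell_error(point, dem, x0, y0, cell, max_dist=0.2):
    """DEM-minus-GNSS elevation difference at the nearest grid point.

    `dem[row][col]` holds the elevation at (x0 + col * cell, y0 + row * cell);
    None marks no data. Returns None when the nearest grid point lies outside
    the grid, has no data, or is farther than `max_dist` (m) from the point.
    """
    x, y, z = point
    col = round((x - x0) / cell)
    row = round((y - y0) / cell)
    if not (0 <= row < len(dem) and 0 <= col < len(dem[0])):
        return None
    if dem[row][col] is None:
        return None
    dist = math.hypot(x - (x0 + col * cell), y - (y0 + row * cell))
    if dist > max_dist:
        return None
    return dem[row][col] - z

# Hypothetical 2 x 2 DEM with 0.5 m cells and one no-data cell.
dem = [[1.0, 1.1],
       [1.2, None]]
near = nearest_cell_error((0.10, 0.05, 0.98), dem, 0.0, 0.0, 0.5)
too_far = nearest_cell_error((0.25, 0.25, 1.00), dem, 0.0, 0.0, 0.5)
no_data = nearest_cell_error((0.50, 0.50, 1.00), dem, 0.0, 0.0, 0.5)
```

The differences returned for all valid comparisons could then be summarized with the mean error and standard deviation of error.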

Finally, the availability of four repeat DEMs obtained using the same workflow for collecting (Table 1) and processing (Section 2.3) the data made it possible to assess photogrammetric replicability at La Palue. The temporal variability in bed elevations was estimated for each DEM cell using both the standard deviation of elevations and the maximum elevation range between repeat surveys. These metrics were used to assess whether a surface cell (and, hence, the underlying terrain) was stable over the survey period. We used a maximum range of 0.035 m to distinguish stable from unstable cells (cells with an elevation range above this threshold were considered unstable). To ensure a robust statistical characterization, only cells measured in all four repeat surveys were retained. A multitemporal ground truth DEM was formed by averaging repeat elevations over stable cells. Comparing individual surveys with this ground truth provided information on possible deviations from the "average topography" of the stable zones.
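The stable-cell classification and multitemporal averaging can be sketched as below. This is a minimal illustration under stated assumptions: DEMs are equally shaped nested lists with `None` for no data, the function names are hypothetical, and the population standard deviation is used because the text does not specify the normalization.

```python
import statistics

def stable_ground_truth(dems, max_range=0.035):
    """Return (truth, spread) grids built from repeat DEMs.

    `spread` holds the per-cell standard deviation of elevations for cells
    measured in every survey. `truth` holds the mean elevation for cells
    whose elevation range does not exceed `max_range` (m); all other cells
    are None.
    """
    rows, cols = len(dems[0]), len(dems[0][0])
    truth = [[None] * cols for _ in range(rows)]
    spread = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            z = [d[r][c] for d in dems]
            if any(v is None for v in z):
                continue  # keep only cells measured in all surveys
            spread[r][c] = statistics.pstdev(z)  # normalization is an assumption
            if max(z) - min(z) <= max_range:
                truth[r][c] = sum(z) / len(z)
    return truth, spread

def deviations_from_truth(dem, truth):
    """Elevation differences (survey minus ground truth) over stable cells."""
    return [zs - zt
            for row_s, row_t in zip(dem, truth)
            for zs, zt in zip(row_s, row_t)
            if zs is not None and zt is not None]

# Four hypothetical 1 x 3 repeat DEMs: stable, unstable and partly missing cells.
repeats = [
    [[1.00, 2.00, 3.0]],
    [[1.01, 2.05, 3.0]],
    [[0.99, 2.00, None]],
    [[1.00, 1.98, 3.0]],
]
truth, spread = stable_ground_truth(repeats)
residuals = deviations_from_truth(repeats[1], truth)
```

Only the first cell is classified as stable in this made-up case, so the residuals for an individual survey are computed over that cell alone.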
