Three-Dimensional Deformation Estimation from Multi-Temporal Real-Scene Models for Landslide Monitoring

Xi, Ke; Tao, Pengjie; Niu, Zhuangqun; Zhu, Xiaokun; Duan, Yansong; Ke, Tao; Zhang, Zuxun

doi:10.3390/rs16152705


by Ke Xi ¹, Pengjie Tao ^1,2,*, Zhuangqun Niu ¹, Xiaokun Zhu ³, Yansong Duan ^1,2, Tao Ke ^1,2 and Zuxun Zhang ¹

¹ School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
² Hubei Luojia Laboratory, Wuhan 430079, China
³ Beijing Institute of Surveying and Mapping, Beijing 100038, China
^* Author to whom correspondence should be addressed.

Remote Sens. 2024, 16(15), 2705; https://doi.org/10.3390/rs16152705

Submission received: 12 June 2024 / Revised: 15 July 2024 / Accepted: 21 July 2024 / Published: 24 July 2024


Abstract

This study proposes a three-dimensional (3D) deformation estimation framework based on the integration of shape and texture information for real-scene 3D model matching, effectively addressing the issue of deformation assessment in large-scale geological landslide areas. By extracting and merging the texture and shape features of matched points, correspondences between points in multi-temporal real-scene 3D models are established, resolving the difficulties faced by existing methods in achieving robust and high-precision 3D point matching over landslide areas. To ensure the complete coverage of the geological disaster area while enhancing computational efficiency during deformation estimation, a voxel-based thinning method to generate interest points is proposed. The effectiveness of the proposed method is validated through tests on a dataset from the Lijie north hill geological landslide area in Gansu Province, China. Experimental results demonstrate that the proposed method significantly outperforms existing classic and advanced methods in terms of matching accuracy metrics, and the accuracy of our deformation estimates is close to the actual measurements obtained from GNSS stations, with an average error of only 2.2 cm.

Keywords: geological landslide monitoring; 3D deformation estimation; UAV; feature extraction; 3D point matching; real-scene 3D model

1. Introduction

Geological disasters are characterized by extensive destructiveness and widespread distribution, posing significant threats to the natural environment, infrastructure, and the safety of human lives and property [1,2,3]. The investigation of landslide susceptibility primarily involves field survey monitoring [4] and physically based modeling methods that incorporate shallow slope stability analysis and relevant material parameters [5,6]. Prior to complete destabilization, many geological disasters experience a sustained period of deformation. Therefore, the precise measurement of deformation in the early stages of geological disasters is crucial for determining whether landslides, hazardous rock masses, and other such phenomena are in a stable condition [7,8,9].

According to the differences in implementation methods, the widely used monitoring technologies can generally be categorized into two types: contact and non-contact techniques. Contact techniques primarily include instruments based on the Global Navigation Satellite System (GNSS), inclinometers, and crack gauges [10,11,12]. These techniques involve the direct installation of sensors and data transmission systems on the surface of the disaster-affected area, enabling the acquisition of local point surface deformation data with millimeter accuracy. However, the monitoring scope and effectiveness of these technologies are limited by the number and distribution of the instruments. Not only are the economic costs high but the spatial resolution is low, providing specific point information only at sampled locations [13,14].

In contrast, non-contact monitoring methods based on remote sensing technology can acquire comprehensive surface information from a distance in disaster-affected areas [15,16,17,18]. However, due to the complex terrain and severe obstructions typical of landslide regions, ground-based remote sensing methods, including total stations [19] and terrestrial laser scanning [20], often struggle to obtain comprehensive and effective observational data. Additionally, satellite monitoring methods face challenges related to long revisit intervals and insufficient resolution [21,22,23]. Although differential InSAR and multi-temporal InSAR methods based on spaceborne SAR satellites are widely used for large-scale geohazard deformation monitoring with high accuracy, they also suffer from calculation errors due to phase decorrelation and insufficient spatial resolution [24,25]. In recent years, highly mobile rotary-wing unmanned aerial vehicles (UAVs) equipped with real-time kinematic (RTK) technology have been employed in landslide surveys [26]. Using photogrammetric techniques, UAVs can achieve a fine reconstruction of landslide areas, thus enabling the more intuitive and safer monitoring of the overall deformation process in these regions [27,28,29].

UAV remote sensing systems surpass satellite remote sensing in terms of data resolution, and they simultaneously offer better terrain adaptability compared to ground-based technologies [30]. Consequently, many researchers have applied UAV photogrammetry to the monitoring of geological landslides [31,32,33]. UAV-based methods fall into two main categories: two-dimensional (2D) and three-dimensional (3D) deformation analysis. Two-dimensional deformation analysis primarily involves generating multi-temporal, high-resolution digital elevation models (DEMs) through airborne LiDAR scanning or photogrammetry [34]. Deformations in the vertical direction are calculated by directly comparing two DEMs [35,36]. This approach is straightforward and effective in minimizing the impact of outliers and high surface roughness, resulting in high precision of the calculated vertical deformations. However, when the deformation is not purely vertical, this method struggles to capture the actual displacement accurately [37].

Three-dimensional deformation analysis methods typically rely on point clouds or three-dimensional meshes. A common practice involves converting a reference point cloud into a 3D mesh and then calculating the minimal distance from points to the mesh to describe 3D deformations [38,39], as exemplified by the cloud-to-cloud comparison [40], cloud-to-mesh comparison, and multiscale model-to-model cloud comparison (M3C2) methods integrated within the point cloud processing software CloudCompare v2.10 [41]. However, these methods compute the Euclidean distance between reference points and the nearest target points, or between points and a 3D mesh [42,43], without considering whether the two points at the shortest distance truly correspond as homologous points. Therefore, these methods primarily calculate deformations perpendicular to the local surface, and when the direction of deformation is parallel to the surface or involves more complex motions, the results obtained may not accurately reflect the actual displacements [44].

Utilizing UAV photogrammetry enables the rapid and low-cost acquisition of high-precision point clouds over large disaster-affected areas [45]. Therefore, deformation can be calculated by matching corresponding points between mesh point clouds from different time phases using methods based on 3D features. Three-dimensional point matching methods are mainly divided into those based on traditional manual features and those based on deep learning features. Traditional manual features are typically obtained by extracting geometric properties, spatial distributions, or statistical histograms of the 3D shape [46,47]. However, these methods primarily rely on geometric statistical features, such as the spatial distribution of neighboring points and the angles between normals [48,49,50,51]. Given the complex and varied geometric shapes in landslide areas, similar geometric information in different areas can lead to mismatches, such as incorrect point location matching or one-to-many matching situations, making it difficult to directly apply these methods to 3D point matching in geological disaster scenarios such as landslides and hazardous rock masses. On the other hand, most deep learning based methods currently focus on point cloud registration [52,53]. In the point cloud registration process, mismatches of homologous points are not crucial, as the goal is simply to find a correct set of key point matches, which is not suitable for the precise calculation of displacement deformations. Furthermore, deep learning methods are data-dependent, requiring extensive training for specific scenarios. However, current training datasets are largely concentrated on indoor and autonomous driving environments, thus posing challenges in effectively generalizing these methods to complex geological landslide contexts.

This study aims to address these deficiencies by establishing point-to-point correspondences between multi-temporal models of geological landslide areas, thereby accurately estimating deformation and promoting the application of UAV photogrammetry in geological landslide monitoring. In the experiment, the proposed method was applied to perform 3D point matching and deformation estimation on the multi-temporal datasets of the Lijie north hill geological landslide area, and its accuracy was evaluated using 130 pairs of manually measured checkpoints. The results demonstrate that our method outperforms existing state-of-the-art and classic methods in terms of matching accuracy and stability. Additionally, the absolute accuracy of the proposed method was evaluated using actual measurements from GNSS stations, showing that the accuracy of our deformation estimates is better than 2.2 cm. The main contributions of this paper are as follows:

A 3D deformation estimation framework based on fused shape/texture information of real-world models is proposed, which constructs accurate point-to-point correspondences between multi-temporal mesh point clouds, thus enabling the precise calculation of 3D deformations in disaster-affected areas.
A real-scene 3D model feature descriptor that integrates shape and texture information is introduced, solving the issue of inaccurate 3D point matching in real-world models of landslide areas.
A method for extracting interest points using voxel-based thinning is proposed, which ensures uniform coverage of the entire landslide area while effectively improving computational efficiency.
We conducted a rigorous evaluation of the proposed methods through numerous manual checkpoints and compared them with the state-of-the-art and classical methods, with results demonstrating our approach’s superior performance in terms of quantitative matching accuracy metrics.

The rest of this paper is organized as follows. Section 2 explains the details of the 3D deformation estimation framework. Section 3 presents experiments on a real-world geological landslide dataset that demonstrate the advantages of our framework quantitatively and visually. Section 4 discusses the effect of the different features within the framework and the limitations of the method. Section 5 concludes the study.

2. Methods

The specific workflow of the proposed framework is depicted in Figure 1. Initially, high-resolution UAV images of the landslide area collected at different times were processed through photogrammetry, and the generated multi-temporal 3D models were registered to maintain coordinate consistency [54]. Subsequently, using the proposed method for extracting interest points using voxel-based thinning, a set of interest points uniformly covering the entire geological landslide area was extracted from the previous time phase 3D mesh point cloud. Next, based on the real-scene 3D model feature descriptor that integrates shape/texture information, the 3D homologous point matching of all interest points was conducted in the subsequent time phase 3D mesh point cloud. Finally, the 3D deformation was calculated based on the matched set of homologous points.

2.1. Voxel-Based Interest Points Extraction

In the conventional process of homologous point matching, feature detection is typically used first to identify feature points, including corners, edges, or areas with prominent textures, which are then described and matched. However, for geological landslide scenarios, since the areas where deformation occurs are unknown, traditional feature detection methods often fail to comprehensively cover all potential deformation areas. Therefore, based on 3D mesh point clouds, a method for extracting interest points using voxel-based thinning is proposed. It is important to emphasize that these interest points are extracted from the previous temporal mesh point cloud and are used for 3D point matching. By performing 3D point matching and deformation calculation on these interest points, the overall deformation of the landslide can be inferred from their deformation results.

Firstly, mesh point clouds were obtained from the reconstructed 3D real-scene models. Due to the highly regular flight paths of the UAVs, the density of the generated mesh point clouds is consistent across different areas. Moreover, since the shooting distances of the UAV images are known, the resolutions of the obtained images can be easily calculated. The density of the generated point clouds can thus be estimated, and the voxel size can be determined accordingly. As shown in Figure 2, by utilizing a voxel-based approach [55], the point cloud was divided into multiple cubic regions, and the point closest to the center of each cubic region was retained, thereby uniformly thinning the point cloud. After completing the initial thinning, we increased the size of the voxels and then proceeded to the next round of thinning. Through this method, interest points of the disaster area thinned at multiple levels were obtained.

This voxel-based multilevel thinning method allows for the convenient selection of all points at a specific level as an interest point set. This method offers two major advantages: first, it significantly reduces the required number of interest points compared to the original mesh point cloud, which markedly enhances computational efficiency; second, the extracted interest point set can uniformly cover the entire geological disaster area, ensuring no potential deformation areas are missed while preserving the structure and shape characteristics of the point cloud data as much as possible. Moreover, once a deformation area is identified, point clouds from other levels, which possess higher point densities, can be utilized to gather additional points surrounding that area, enabling a more comprehensive analysis of the deformation.
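The multilevel thinning described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names `voxel_thin` and `multilevel_thin` are hypothetical, and the sketch assumes that each level thins the output of the previous level (the text does not say whether later levels start again from the original cloud). The defaults mirror the settings reported in Section 3.3 (initial voxel size 0.1, doubling per level, seven levels).

```python
import math


def voxel_thin(points, voxel_size):
    """Keep, for every occupied voxel, the point closest to the voxel center."""
    best = {}  # voxel index -> (squared distance to center, point)
    for p in points:
        idx = tuple(math.floor(c / voxel_size) for c in p)
        center = tuple((i + 0.5) * voxel_size for i in idx)
        d2 = sum((c - m) ** 2 for c, m in zip(p, center))
        if idx not in best or d2 < best[idx][0]:
            best[idx] = (d2, p)
    return [p for _, p in best.values()]


def multilevel_thin(points, initial_voxel=0.1, levels=7, growth=2.0):
    """Return one thinned point set per level, finest first.

    Assumption: level k+1 is obtained by thinning level k with a larger voxel.
    """
    result = []
    current = list(points)
    size = initial_voxel
    for _ in range(levels):
        current = voxel_thin(current, size)
        result.append(current)
        size *= growth
    return result


if __name__ == "__main__":
    # Synthetic planar grid with 0.5 spacing, thinned with 1, 2 and 4 unit voxels.
    cloud = [(0.5 * x + 0.1, 0.5 * y + 0.1, 0.0) for x in range(8) for y in range(8)]
    for level, pts in enumerate(multilevel_thin(cloud, initial_voxel=1.0, levels=3)):
        print(level, len(pts))
```

Any single level can then be used as the interest point set, and a finer level can be consulted around an area where deformation has been detected.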

2.2. Texture Feature Descriptor

Real-scene 3D reconstruction technology has gradually matured, and is capable of providing comprehensive information, including geometric shapes and image textures. In this context, 3D meshes represent the complete 3D geometric information of the reconstructed object, while textures are used to convey radiometric information.

For typical geological disaster scenarios, such as landslides or hazardous rock masses, where the terrain inclination is substantial, orthoimages generated by traditional methods (i.e., projecting onto a horizontal plane) may incur computational errors due to the compression of complex-shaped targets. To express the geometric structural information of complex-shaped targets more accurately, a local 2D mapping method is employed that orthogonally projects the area around each interest point, creating local 2D images and thereby reducing the dimensionality of the 3D model. Since the projection surface of the local 2D images is the optimal spatial plane directly facing the ground scene (such as a cross-section of the landslide surface), it minimizes the projection distortion and enhances the representation of ground information. The process of generating texture feature descriptors from local 2D images is illustrated in Figure 3.

Firstly, an appropriate plane for the area around the interest points is determined by fitting a plane to the surrounding mesh point cloud, thereby establishing the normal vector $\vec{N}(a, b, c)$ of the spatial projection plane ($a^{2} + b^{2} + c^{2} = 1$), and then the rotation matrix $R$ from the object coordinate system $(O, X, Y, Z)$ to the projected plane coordinate system $(O_{e}, X_{e}, Y_{e}, Z_{e})$ is calculated by

$$R = (\vec{X_{e}}, \vec{Y_{e}}, \vec{Z_{e}})^{T} = \begin{bmatrix} -b / \sqrt{a^{2} + b^{2}} & a / \sqrt{a^{2} + b^{2}} & 0 \\ -ac / \sqrt{a^{2} + b^{2}} & -bc / \sqrt{a^{2} + b^{2}} & \sqrt{a^{2} + b^{2}} \\ a & b & c \end{bmatrix} \tag{1}$$

Next, by calculating the coordinates of each vertex of the bounding box of the surrounding area, the starting point coordinates $(X_{s}, Y_{s}, Z_{s})$ are determined. The transformation relationship between any point in the area around the interest points, from coordinates $(X, Y, Z)$ in the original coordinate system $(O, X, Y, Z)$ to coordinates $(X', Y', Z')$ in the projected plane coordinate system $(O_{e}, X_{e}, Y_{e}, Z_{e})$, is expressed as:

$$\begin{bmatrix} X' \\ Y' \\ Z' \end{bmatrix} = R \left( \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} - \begin{bmatrix} X_{s} \\ Y_{s} \\ Z_{s} \end{bmatrix} \right) \tag{2}$$
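As a quick check of Equations (1) and (2), the following sketch builds the rotation matrix from a unit normal and maps a point into the projected plane coordinate system. The function names `rotation_from_normal` and `to_plane_coords` are hypothetical, and the guard for a vertical normal (a = b = 0, where Equation (1) is undefined) is an addition that the text does not discuss.

```python
import math


def rotation_from_normal(a, b, c):
    """Rows are the X_e, Y_e, Z_e axes of Eq. (1); (a, b, c) must be a unit normal."""
    s = math.hypot(a, b)
    if s < 1e-12:
        # Not covered by the text: Eq. (1) divides by sqrt(a^2 + b^2).
        raise ValueError("Eq. (1) is undefined for a vertical normal (a = b = 0)")
    return [
        [-b / s, a / s, 0.0],
        [-a * c / s, -b * c / s, s],
        [a, b, c],
    ]


def to_plane_coords(point, start, rot):
    """Eq. (2): subtract the starting point, then rotate into the plane frame."""
    d = [p - s for p, s in zip(point, start)]
    return [sum(rot[i][j] * d[j] for j in range(3)) for i in range(3)]


if __name__ == "__main__":
    R = rotation_from_normal(0.6, 0.0, 0.8)
    # A point one unit along the normal should land on the Z_e axis.
    print(to_plane_coords((0.6, 0.0, 0.8), (0.0, 0.0, 0.0), R))
```

The three rows are mutually orthogonal unit vectors, so the transform preserves distances; the third coordinate of the result is the offset from the projection plane, and the first two index the local 2D image.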

Finally, each triangular facet in the area surrounding the interest points is transformed into the local coordinate system and rasterized to produce a local 2D image of the area. After obtaining the local 2D image, the Histogram of Oriented Gradients (HOG) feature descriptor is adopted to extract features from the neighborhood pixels of the interest point in the 2D image, resulting in a feature vector corresponding to the pixel. Then, for the mesh of the subsequent time phase model, a local 2D image of equal width and length $R_{t}$ is also constructed centered around the interest point, and the HOG features are extracted from each pixel of this image, excluding those at the boundaries. Finally, feature matching is conducted, resulting in a set of correlation coefficients related to the interest point.
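To make the texture matching step more concrete, the sketch below computes a heavily simplified gradient-orientation histogram for a small grayscale patch and compares two such descriptors with a correlation coefficient. It is not a full HOG implementation: it uses a single cell without block normalization, and the use of the Pearson coefficient is an assumption, since the text only states that correlation coefficients are obtained. The names `orientation_histogram` and `pearson` are hypothetical.

```python
import math


def orientation_histogram(patch, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations (0-180 deg)."""
    hist = [0.0] * bins
    rows, cols = len(patch), len(patch[0])
    for y in range(1, rows - 1):          # boundary pixels are skipped
        for x in range(1, cols - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle / (180.0 / bins)) % bins] += mag
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su > 0 and sv > 0 else 0.0


if __name__ == "__main__":
    horizontal_ramp = [[float(x) for x in range(6)] for _ in range(6)]
    vertical_ramp = [[float(y)] * 6 for y in range(6)]
    h1 = orientation_histogram(horizontal_ramp)
    h2 = orientation_histogram(vertical_ramp)
    print(pearson(h1, h1), pearson(h1, h2))
```

In the workflow of the paper, one descriptor would come from the interest point in the earlier model and one from each candidate pixel in the later model, giving one coefficient per candidate.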

The point clouds used in this study are from the mesh of real-scene models, where the texture information for each triangular facet comes directly from the original images captured by UAVs, thus the resolution is consistent with the original images. In the process of performing local 2D mapping, interpolating from the original images to obtain the local 2D images ensures that the resolution of the local 2D images remains consistent with the original images, thereby enabling more precise point matching. During the matching of subsequent temporal data, to ensure that the sets of texture descriptors and shape descriptors correspond and remain consistent, thereby facilitating a more effective integration of these two types of descriptors, all pixels that have generated feature vectors in the 2D images are elevated to the 3D coordinate system. This elevation process allows us to obtain a candidate set of points for the shape descriptors.

2.3. Shape Feature Descriptor

Due to the long intervals between multi-temporal UAV data collections, variations in lighting conditions and seasonal differences among different datasets may lead to significant changes in the texture information of the same physical features, thus causing considerable errors in texture-based matching. To overcome this challenge, in addition to using texture information descriptors, a shape feature descriptor is also introduced for matching. A robust and resilient local reference frame (LRF) is constructed using a distance-weighted method for neighboring points to reduce the sensitivity to occlusions and noise points during the matching process. Furthermore, three different attributes are utilized to encode the position and orientation of each candidate point and integrate these details through histograms to minimize the error in homologous point matching as much as possible. Figure 4 illustrates the construction process of the geometric feature descriptors.

For the candidate point set obtained in Section 2.2, centered around point $p$, all points are collected within a radius $R$ to construct an LRF system. The specific calculation is as follows: First, within a spherical region with a support radius $R$ centered at $p$, all points in this region are defined as the spherical neighborhood points of $p$, collectively forming a 3D local surface:

$$Q = \{ q_{i} : \| q_{i} - p \| < R \} \tag{3}$$

To enhance robustness against noise and occlusions, a distance-weighted method is employed for nearby points, assigning greater weight to points closer to $p$. Thus, the covariance matrix $M_{p}$ is represented as a weighted linear combination, expressed as:

$$M_{p} = \frac{1}{\sum_{q_{i} \in Q} \omega_{i}^{(D)}} \sum_{q_{i} \in Q} \omega_{i}^{(D)} (q_{i} - p) (q_{i} - p)^{T} \tag{4}$$

where $\omega_{i}^{(D)}$ represents the distance weight for the neighboring point $q_{i}$, normalized by the support radius $R$, with the calculation formula:

$$\omega_{i}^{(D)} = 1 - \| q_{i} - p \| / R \tag{5}$$
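Equations (3) to (5) translate directly into code. The sketch below, with the hypothetical function name `weighted_covariance`, collects the spherical neighborhood of a point, applies the linear distance weight, and accumulates the weighted covariance matrix. It stops before the eigenvalue decomposition, which in practice would be delegated to a linear algebra library.

```python
import math


def weighted_covariance(p, cloud, radius):
    """Distance-weighted covariance M_p of the neighbours of p (Eqs. 3-5)."""
    acc = [[0.0] * 3 for _ in range(3)]
    weight_sum = 0.0
    for q in cloud:
        d = [qc - pc for qc, pc in zip(q, p)]
        dist = math.sqrt(sum(c * c for c in d))
        if dist >= radius:            # Eq. (3): strict inequality
            continue
        w = 1.0 - dist / radius       # Eq. (5): closer points weigh more
        weight_sum += w
        for i in range(3):
            for j in range(3):
                acc[i][j] += w * d[i] * d[j]
    if weight_sum == 0.0:
        raise ValueError("no neighbours inside the support radius")
    return [[v / weight_sum for v in row] for row in acc]  # Eq. (4)


if __name__ == "__main__":
    cloud = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
    for row in weighted_covariance((0.0, 0.0, 0.0), cloud, 2.0):
        print(row)
```

In the small example, the point at distance 3 falls outside the support radius of 2 and is ignored, so all variance lies along the X axis.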

By performing an eigenvalue decomposition on $M_{p}$, and sorting the resulting eigenvectors by the magnitude of the eigenvalues, the eigenvector corresponding to the largest eigenvalue is defined as the Z-axis, the second largest as the X-axis, and the smallest as the Y-axis.

After establishing the LRF, to eliminate the effects of rigid body transformations, the point set $Q$ is transformed into the LRF centered at point $p$. As shown in Figure 4b, based on the transformed local surface data, the space is first divided into several local subspaces along the radial direction, and three geometric attributes are then computed within each subspace. Figure 4c displays the schematic diagram of the local height for point $q_{k}$, which is calculated as follows:

$$lh_{k} = R + \mathrm{LRF}(p) \cdot \overrightarrow{p q_{k}} \tag{6}$$

where $lh_{k}$ represents the local height of point $q_{k}$, and $\overrightarrow{p q_{k}}$ represents the vector from $p$ to $q_{k}$. $\mathrm{LRF}(p)$ represents the axes of the local reference coordinate system at point $p$, and the range of $lh_{k}$ is $[0, 2R]$.

For neighboring point $q_{k}$, the definitions of the two angles $\alpha_{k}$ and $\beta_{k}$ are as shown in Figure 4d, and they are calculated as follows:

$$\alpha_{k} = \arccos \left( \mathrm{LRF}(p) \cdot n(q_{k}) \right) \tag{7}$$

$$\beta_{k} = \arccos \left( \frac{\overrightarrow{p q_{k}} \cdot n(q_{k})}{\| \overrightarrow{p q_{k}} \|} \right) \tag{8}$$

where $n(q_{k})$ represents the normal of $q_{k}$. The ranges for both $\alpha_{k}$ and $\beta_{k}$ are $[0, \pi]$.
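The three attributes of Equations (6) to (8) can be evaluated for a single neighbor as in the sketch below. Two assumptions are made and should be read as such: the term LRF(p) in Equations (6) and (7) is taken to be the unit Z-axis of the local reference frame, and the normal of the neighbor is assumed to have unit length. The function name `local_attributes` is hypothetical.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _safe_acos(x):
    # Clamp to [-1, 1] so rounding noise cannot push acos out of its domain.
    return math.acos(max(-1.0, min(1.0, x)))


def local_attributes(p, q, normal_q, z_axis, radius):
    """Return (lh, alpha, beta) of neighbour q relative to p (Eqs. 6-8).

    Assumptions: z_axis stands in for LRF(p); normal_q and z_axis are unit vectors.
    """
    pq = [qc - pc for qc, pc in zip(q, p)]
    length = math.sqrt(_dot(pq, pq))
    if length == 0.0:
        raise ValueError("q coincides with p; beta is undefined")
    lh = radius + _dot(z_axis, pq)                    # Eq. (6), in [0, 2R]
    alpha = _safe_acos(_dot(z_axis, normal_q))        # Eq. (7), in [0, pi]
    beta = _safe_acos(_dot(pq, normal_q) / length)    # Eq. (8), in [0, pi]
    return lh, alpha, beta


if __name__ == "__main__":
    print(local_attributes((0, 0, 0), (1.0, 0.0, 1.0), (0, 0, 1), (0, 0, 1), 2.0))
```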

After computing the three geometric attributes, the values of $lh_{k}$, $\alpha_{k}$, and $\beta_{k}$ are statistically analyzed to create histograms for each subspace, as shown in Figure 4e. Subsequently, histograms from all subspaces are normalized and concatenated, as illustrated in Figure 4f, to generate the geometric descriptor vector for the candidate point $p$. The correlation coefficients for the candidate point set are then obtained by calculating the correlation between these histograms.
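The histogram construction can be illustrated as follows. Several details here are assumptions rather than facts from the paper: the radial subspaces are taken to be equal-width shells, each attribute histogram uses five bins, and each histogram is normalized to unit sum before concatenation. The function name `shape_descriptor` is hypothetical; only the use of four radial subspaces comes from the experimental settings.

```python
import math


def shape_descriptor(samples, radius, shells=4, bins=5):
    """Concatenate per-shell histograms of (lh, alpha, beta).

    samples: iterable of (distance_to_p, lh, alpha, beta), one per neighbour.
    Assumptions: equal-width radial shells, `bins` bins per attribute,
    each histogram normalised to sum to 1.
    """
    ranges = [(0.0, 2.0 * radius), (0.0, math.pi), (0.0, math.pi)]
    hist = [[[0.0] * bins for _ in ranges] for _ in range(shells)]
    for dist, *attrs in samples:
        s = min(int(dist / radius * shells), shells - 1)
        for a, (value, (lo, hi)) in enumerate(zip(attrs, ranges)):
            b = int((value - lo) / (hi - lo) * bins)
            hist[s][a][max(0, min(b, bins - 1))] += 1.0
    descriptor = []
    for shell in hist:
        for h in shell:
            total = sum(h)
            descriptor.extend(v / total if total else 0.0 for v in h)
    return descriptor


if __name__ == "__main__":
    samples = [(0.1, 2.0, 0.0, math.pi), (1.9, 0.0, 1.0, 1.0)]
    d = shape_descriptor(samples, radius=2.0)
    print(len(d), sum(d))
```

With four shells, three attributes, and five bins, the descriptor has 60 entries; descriptors of two candidate points can then be compared with a correlation coefficient.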

2.4. Features Integration

After calculating the correlation using both texture feature descriptors and shape feature descriptors, the correlation coefficient results $r_{t}$ and $r_{g}$ for candidate points are obtained. Subsequently, a weighted average method is employed to determine the final homologous point:

$$r(s) = \max \{ (w_{t} \cdot r_{t_{s}} + w_{g} \cdot r_{g_{s}}) \mid s \in S \} \tag{9}$$

where $w_{t}$ and $w_{g}$ are the weights assigned to the texture feature and shape feature, respectively, determined based on factors such as the collection times of the multi-temporal data and the shooting environment. $S$ represents the set of candidate points, and $r(s)$ denotes the highest weighted value among all candidate points; the candidate that attains this value is identified as the final homologous point.
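Equation (9) amounts to a weighted sum followed by an argmax over the candidates. The sketch below uses the hypothetical function name `fuse_and_select`; the default weights of 0.7 and 0.3 are the values reported for the experiments, and the example coefficients are made up purely for illustration.

```python
def fuse_and_select(r_t, r_g, w_t=0.7, w_g=0.3):
    """Return (index, score) of the candidate maximising w_t*r_t + w_g*r_g (Eq. 9)."""
    if len(r_t) != len(r_g) or not r_t:
        raise ValueError("r_t and r_g must be non-empty and of equal length")
    scores = [w_t * t + w_g * g for t, g in zip(r_t, r_g)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


if __name__ == "__main__":
    # Illustrative coefficients for three candidates (not data from the paper).
    texture = [0.2, 0.9, 0.5]
    shape = [0.9, 0.4, 0.6]
    print(fuse_and_select(texture, shape))
```

Here the second candidate wins because its strong texture correlation outweighs its weaker shape correlation under the 0.7/0.3 weighting.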

3. Experiments and Results

3.1. Experiment Data

The performance of the proposed method for estimating 3D deformation was tested by applying it in the Lijie north hill landslide area, as shown in Figure 5a. The Lijie north hill landslide is located in Lijie Town, Zhouqu County, Gansu Province, China. Due to regional tectonic activity and river erosion, the area features deeply incised valleys and steep mountains, with frequent geological disasters. The study area is approximately 300 m in length, 400 m in width, and has an elevation difference of nearly 300 m. A typical alluvial fan forms at the mouth of the gully, and some residential buildings of Lijie Town are constructed on these debris flow alluvial fans.

In this experiment, a DJI M300 RTK drone and DJI Zenmuse P1 camera were utilized to capture images over the Lijie north hill landslide area. UAV images of two temporal phases were acquired in March and September 2022, respectively. The parameters for both flights were kept consistent, with a flight altitude of 100 m, a forward overlap of 80%, and a side overlap of 50%. After data collection, the high-resolution images were processed to obtain the real-scene 3D model. It should be noted that in the process of multi-temporal model registration, we first manually selected some points from stable areas (i.e., areas without deformation) based on the results of the first period of aerial triangulation. These points were then used as control points for the aerial triangulation of the second period images by using the method proposed in [54], thereby achieving the registration of the models from the two periods.

For general geological disaster deformation areas, existing monitoring methods such as GNSS base stations typically allow displacement monitoring at only a few key points and do not cover most of the disaster area. Such data make it difficult to quantitatively analyze areas away from the key points, and their spatial scope is not comprehensive enough. Therefore, to meet the needs for detailed quantitative analysis of large deformation areas, we used the multi-temporal 3D data from the Lijie north hill landslide terrain to manually collect 130 pairs of homologous points, uniformly distributed across the two phases of real-scene 3D models, as manual checkpoints. These checkpoints were used as ground truth for quantitative calculations and data validation of different methods, as well as for performance evaluation. The specific distribution of the manual checkpoints is illustrated in Figure 5c.

The detailed displacement parameters of the manually measured checkpoints are shown in Figure 6. It can be observed that the displacement range is concentrated within 2 m. Additionally, the displacement variations among different checkpoints differ, thereby enhancing the diversity of the test.

3.2. Evaluation Metrics

This paper presents a method for estimating 3D deformations by matching real-scene 3D models using an integration of shape and texture mapping information. Three-dimensional point matching is the core aspect of deformation estimation, and the matching performance is primarily evaluated using the following three metrics:

Root mean square error (RMSE) of matching residual errors in XYZ directions: For the matched homologous point pairs, the RMSEs of matching residual errors in the XYZ directions are calculated for quantitative evaluation. For example, the calculation formula for the X direction is as follows:

$$\mathrm{RMSE}_{X} = \sqrt{\frac{\sum_{i = 1}^{n} (X_{i} - \hat{X}_{i})^{2}}{n}} \tag{10}$$

where $n$ is the total number of matched points, $X_{i}$ is the X coordinate of the $i$-th matched point, and $\hat{X}_{i}$ is the X coordinate of the corresponding checkpoint.

Standard deviation $\sigma$ of matching residual errors in XYZ directions: To evaluate the robustness of different methods in matching 3D homologous points, the standard deviations of the matching residual errors in the XYZ directions are calculated for all matched homologous point pairs.

Correct matching rate CMR: CMR refers to the rate of correctly matched point pairs to all point pairs. If the absolute value of the matching error in any XYZ direction (i.e., the difference between the deformation values calculated from automatically matched homologous points and those obtained through manually measured checkpoints) exceeds 10 cm, the point pair is considered incorrectly matched [53,56].
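The three matching metrics can be computed together, as in the sketch below. The function name `matching_metrics` is hypothetical. The sketch assumes the population form of the standard deviation (dividing by n), which the text does not specify, and it treats a pair as correct only when the absolute residual in every axis is at most 10 cm, following the CMR definition above.

```python
import math


def matching_metrics(matched, checkpoints, threshold=0.10):
    """Per-axis RMSE, per-axis standard deviation, and CMR.

    matched, checkpoints: equal-length lists of (X, Y, Z) in metres.
    Assumption: population standard deviation (divide by n).
    """
    n = len(matched)
    if n == 0 or n != len(checkpoints):
        raise ValueError("inputs must be non-empty and of equal length")
    res = [[m[k] - c[k] for k in range(3)] for m, c in zip(matched, checkpoints)]
    rmse = [math.sqrt(sum(r[k] ** 2 for r in res) / n) for k in range(3)]  # Eq. (10)
    mean = [sum(r[k] for r in res) / n for k in range(3)]
    std = [math.sqrt(sum((r[k] - mean[k]) ** 2 for r in res) / n) for k in range(3)]
    correct = sum(all(abs(v) <= threshold for v in r) for r in res)
    return rmse, std, correct / n


if __name__ == "__main__":
    matched = [(1.0, 2.0, 3.0), (1.2, 2.0, 3.0)]
    truth = [(1.0, 2.0, 3.0), (1.0, 2.0, 3.0)]
    print(matching_metrics(matched, truth))
```

In the example, the second pair is off by 20 cm in X, so it counts as a mismatch and the CMR is 0.5.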

Additionally, in the comparative experiments of the interest point extraction, the evaluation is based on the number of extracted interest points, the extraction time, and the displacement point coverage. Displacement point coverage refers to the proportion of manual checkpoints covered by the extracted interest points. A manual checkpoint is considered covered if an extracted interest point is within a 2 m radius of the checkpoint.
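Displacement point coverage, as defined above, is a simple proximity test. In this sketch the function name `displacement_point_coverage` is hypothetical, and treating a distance of exactly 2 m as covered is an assumption; a brute-force search is used for clarity, whereas a spatial index would be preferable for large point sets.

```python
import math


def displacement_point_coverage(checkpoints, interest_points, radius=2.0):
    """Fraction of checkpoints with at least one interest point within `radius` metres."""
    if not checkpoints:
        raise ValueError("at least one checkpoint is required")
    covered = sum(
        any(math.dist(c, p) <= radius for p in interest_points)
        for c in checkpoints
    )
    return covered / len(checkpoints)


if __name__ == "__main__":
    checkpoints = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    interest = [(1.0, 0.0, 0.0)]
    print(displacement_point_coverage(checkpoints, interest))
```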

3.3. Implementation Details

All experiments were conducted on a computer equipped with an Intel (R) Core (TM) i7-6700HQ 2.60 GHz CPU and 64 GB RAM. In the interest point extraction experiments, classic methods such as SIFT3D, Harris3D, and ISS3D were selected as baselines. For the 3D homologous point matching experiments, five representative 3D homologous point geometric descriptor methods were chosen for comparison, including 3DSC [47], SHOT [50], FPFH [49], RoPS [48], and SpinNet [53]. Among these, 3DSC, SHOT, FPFH, and RoPS are widely cited classic local point cloud feature descriptors, which have demonstrated superior performance across multiple public datasets. SpinNet, a 3D feature descriptor based on deep learning networks, has recently achieved leading results on both indoor and outdoor public datasets.

In the interest point extraction comparison experiments, the parameters for the SIFT3D, Harris3D, and ISS3D methods were set to conventional values. In SIFT3D, six octaves are defined, with eight scale layers generated under each octave, and the contrast threshold is set at 0.01. The corner response threshold for Harris3D is set at 0.02. For ISS3D, the neighborhood radius is 0.2, the minimum eigenvalue threshold is 0.975, the corner response threshold is 0.24, and the non-maximum suppression radius is 0.16. In our proposed method, the initial voxel size is 0.1, with each subsequent layer doubling in voxel size, and the entire point cloud is thinned into seven layers.

In the 3D point matching comparative experiments, for our method in the texture descriptors, $R_{t}$ is set to 2 m, the number of orientations for HOG features is set to nine, the cell size is set to 8 m, and the block size is set to 2 m. In the shape descriptors, the support radius $R$ for establishing the LRF is set to 2 m, and it is divided into four subspaces radially. Additionally, the weights for the texture and geometric descriptors, $w_{t}$ and $w_{g}$, are set to 0.7 and 0.3, respectively. For the methods 3DSC, SHOT, FPFH, and RoPS, the support radius is uniformly set to 3.0 m. For RoPS, the number of rotations and the number of histogram partitions are set to 3 and 5, respectively, and for 3DSC, SHOT, and FPFH, the normal vectors are calculated using 50 nearest neighbor points. In the SpinNet network, the descriptor radius is 1.0 m, and the number of partitions for radial, azimuthal, and elevation directions are 9, 80, and 40, respectively, with a voxel radius and the number of sampling points being 0.5 and 30, respectively. It is worth emphasizing that since our method utilizes geographic coordinate constraints, to ensure a fair comparison in the experiments, other methods also search for corresponding points within a 4 m radius of the interest points during the 3D point matching process.

3.4. Comparative Experiment on Interest Points Extraction

When calculating deformations in geological disaster scenarios, the exact locations of deformations are unknown, necessitating the comprehensive consideration of every area within the scene. This requires the uniform distribution of interest points across the entire scene. Figure 7 presents the results of interest point extraction by other baseline methods and the proposed method, demonstrating that the proposed method extracts more evenly distributed interest points across the entire geological landslide test area and ensures more comprehensive coverage. In contrast, the points extracted by the other methods covered partial areas and did not achieve the complete coverage of the entire region. This may be due to the limited number of points with distinct geometric features within the entire geohazard area, which are primarily concentrated in certain regions.

Table 1 provides a quantitative analysis of the extracted interest points. On the key metric of displacement point coverage, the proposed method covers all displacement points (100%), whereas the other three methods cover only 10.2% to 53.7% of them, which does not meet the needs of deformation calculations in large-scale geological disaster scenarios. The proposed method also extracts the most interest points (4091) in the least time (195 s), indicating an improvement in computational efficiency over the other representative methods.

3.5. Three-Dimensional Point Matching Comparative Experiment

To quantitatively analyze the matching accuracy of the proposed method compared to other baselines, this section outlines the experiments and evaluations conducted on the Lijie north hill landslide area dataset, based on the multiple metrics described in Section 3.2. The evaluation results are presented in Table 2. It should be noted that outliers are eliminated before the metrics are calculated for each method, in order to prevent gross errors from distorting the results. The remaining homologous point pairs are then used to calculate the RMSE, standard deviation, and CMR values.
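For reference, per-axis RMSE and standard deviation can be computed from matching errors as in the following sketch. It assumes that outliers have already been removed upstream (the elimination rule is not given in this section) and uses the population standard deviation, which is an assumption; the exact definitions are those of Section 3.2.

```python
import math

def rmse(values):
    """Root mean square of a sequence of residuals."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def std(values):
    """Population standard deviation (assumed convention)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

def per_axis_stats(errors):
    """errors: list of (dx, dy, dz) matching errors in metres."""
    return {
        axis: {"rmse": rmse(vals), "std": std(vals)}
        for axis, vals in zip("XYZ", zip(*errors))
    }

# Toy residuals for two matched pairs (illustrative values, not paper data).
stats = per_axis_stats([(0.01, -0.02, 0.0), (-0.01, 0.02, 0.0)])
```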

As shown in Table 2, all methods completed the 3D point matching process for the test area. In key performance metrics such as the RMSE, the standard deviation of matching errors and CMR in the XYZ directions, the proposed method exhibited superior performance, demonstrating greater accuracy and robustness in estimating 3D deformations in the landslide area compared to other baseline methods. This is because the proposed method combines texture features with shape features for 3D point matching. The inclusion of texture features allows the matching accuracy to reach pixel level, which is difficult to achieve with methods that only use shape features.

In terms of manual feature descriptors, the 3DSC, SHOT, and FPFH methods showed less than ideal values for the RMSE, standard deviation, and CMR of matched 3D homologous points, indicating numerous mismatches during the matching process. This may be due to the complex shapes of the landslide area, to which manually defined descriptors struggle to adapt. Additionally, the presence of planar point clouds with extremely similar geometric structures in the test area made it difficult for traditional manual features to extract effective shape information, also leading to mismatches. Furthermore, with a six-month interval between the two collections, the growth of weeds in localized areas could also contribute to matching errors. In contrast, the RoPS method considers multiple projections of point clouds, providing rich information about local shapes and effectively capturing and describing complex geometric structures. When dealing with the point clouds of geological landslide areas with complex geometric features, RoPS is able to distinguish different local shapes and structures, and thus significantly outperforms the other three manual feature descriptors in terms of the RMSE, standard deviation, and CMR metrics. The SpinNet method did not fully realize the potential of deep learning networks in the geological landslide scenario, possibly because the features of the data used to train the network differ significantly from those present in such scenarios, which makes it difficult to generalize the method directly. If training data from geological landslide scenarios were available and the network were fine-tuned accordingly (for example, by adjusting the descriptor radius and the numbers of partitions in the radial, azimuthal, and elevation directions), SpinNet would be expected to adapt better to such scenarios and its performance should improve.

Figure 8 displays the visualization results of 3D homologous point matching in the Lijie north hill test area. In the visualization, green lines represent correctly matched point pairs, while red lines indicate incorrectly matched point pairs. Given the inherent difficulty of 3D homologous point matching, a matched point pair is considered incorrect if the absolute value of the matching error in any of the XYZ directions exceeds 10 cm [53].
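The correctness criterion above can be written down directly. The following is a minimal sketch of the 10 cm rule and of a correct matching rate computed from it; the function names are hypothetical, and the sketch assumes that CMR is the percentage of matched pairs that pass the rule.

```python
THRESHOLD_M = 0.10  # 10 cm tolerance per axis, as stated in the text

def is_correct(error, threshold=THRESHOLD_M):
    """A match is correct only if no axis error exceeds the threshold."""
    return all(abs(c) <= threshold for c in error)

def correct_match_rate(errors):
    """Percentage of matches whose (dx, dy, dz) errors pass the rule."""
    if not errors:
        return 0.0
    return 100.0 * sum(is_correct(e) for e in errors) / len(errors)

# Toy errors in metres (illustrative values, not paper data).
toy_errors = [
    (0.01, 0.02, 0.03),   # correct
    (0.11, 0.00, 0.00),   # X exceeds 10 cm
    (0.00, 0.00, -0.20),  # Z exceeds 10 cm
    (0.10, 0.10, 0.10),   # exactly at the limit, still correct
]
cmr = correct_match_rate(toy_errors)
```

Note that the rule is applied per axis, so a pair can be rejected on a single direction even when its other two errors are negligible.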

From Figure 8, it can be seen that the majority of the matched point pairs obtained by the proposed method are represented by green lines, with only a few inconspicuous red lines, indicating that the method’s accuracy in 3D homologous point matching is very high, thus meeting the requirements for precise 3D deformation estimation in large geological landslide areas. In contrast, the results from 3DSC, SHOT, and FPFH methods show a significantly larger number of red lines, indicating numerous mismatches and higher error values. The RoPS method, however, produces fewer red lines compared to the other manual descriptor methods, demonstrating its advantage in capturing the complex geometric information of the geohazard area. Although the SpinNet method, which utilizes deep learning-based feature descriptors, theoretically has superior capabilities for extracting geometric information, the complex and variable geometric features in the test area differ significantly from the training data; thus, it does not demonstrate a clear advantage in 3D point matching, and the number of red lines is also relatively high.

Figure 9 displays the results of point matching in a local area of the Lijie north hill using the proposed method and baseline methods. A higher overlap between the yellow and red points indicates greater accuracy in the point matching process. It can be seen that the points matched by the proposed method almost completely coincide with the ground truth points, demonstrating the high accuracy of the proposed method. The RoPS method also exhibits relatively high accuracy, while the SpinNet, SHOT, and FPFH methods show a reduced overlap between the matched points and the ground truth points, indicating lower matching accuracy than the proposed and RoPS methods. The homologous points matched by the 3DSC method lie at a greater distance from the ground truth points, indicating the poorest accuracy in the point matching process.

Figure 10 displays the heat maps of the distribution of residual errors for 3D homologous point matching across the XYZ directions, using different methods. The horizontal axis represents the magnitude of the matching residuals, while different colors indicate the percentage of matched points within that residual range relative to the total number of matches.

It is evident that the residuals obtained by the proposed method are predominantly concentrated within the 0–0.02 m range, showing a distribution that is significantly better than that of the other baseline methods. The residual distributions for the manual feature descriptor methods 3DSC, SHOT, and FPFH are less favorable, with most residuals concentrated between 0.08 and 0.2 m and the remainder spread across other intervals. This reflects the suboptimal robustness of these methods in 3D matching: the terrain in the test area combines complex geometric structures with some geometrically nondescript planar areas, which leads to unstable matching performance. In some areas, these methods may adapt well, while in others, they struggle to capture the correct feature information, resulting in larger matching residuals. The RoPS method shows a notably better overall residual distribution in the XYZ directions than the other manual feature descriptor methods, indicating higher accuracy in 3D matching. Because its geometric feature extraction does not generalize well to geological landslide scenarios, the residual distribution for the SpinNet method is less than ideal.
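A residual distribution of this kind can be tabulated by binning absolute residuals and converting counts to percentages of all matches. The sketch below assumes 0.02-wide bins up to 0.2 with a final overflow bin; the bin layout actually used for Figure 10 is not stated, so these values are illustrative.

```python
def residual_percentages(residuals, bin_width=0.02, n_bins=10):
    """Percentage of matches per residual bin.

    Bin k covers [k * bin_width, (k + 1) * bin_width); the last entry
    collects everything at or above n_bins * bin_width.
    """
    if not residuals:
        return [0.0] * (n_bins + 1)
    counts = [0] * (n_bins + 1)
    for r in residuals:
        idx = min(int(abs(r) / bin_width), n_bins)
        counts[idx] += 1
    return [100.0 * c / len(residuals) for c in counts]

# Toy residuals in metres for one axis (illustrative values, not paper data).
pct = residual_percentages([0.005, -0.015, 0.019, 0.09, 0.5])
```

In this toy example, 60% of the residuals fall in the first bin, 20% in the 0.08–0.10 bin, and 20% in the overflow bin.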

3.6. Comparison with GNSS-Measured Values

To validate the effectiveness of the proposed method, the deformation measurements of the proposed method were compared with the measurements from the GNSS monitoring points distributed in the multi-temporal Lijie north hill landslide area, which are regarded as ground truth. The locations of these GNSS sites are shown in Figure 11.

Interest points were selected at each of the five GNSS sites, and 3D homologous point matching was performed for them. The displacement distances of these points were then calculated from the matching results and compared with the measurements from the GNSS stations. The results are shown in Table 3.
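Once a homologous point pair is available, the displacement follows from the coordinate difference between the two epochs. The following minimal sketch assumes both points are expressed in the same georeferenced coordinate system and in metres; the function name and the coordinates are hypothetical.

```python
import math

def displacement(p_before, p_after):
    """Return the per-axis displacement vector and its 3D length."""
    delta = tuple(b - a for a, b in zip(p_before, p_after))
    return delta, math.sqrt(sum(c * c for c in delta))

# Toy coordinates in metres (illustrative values, not paper data).
delta, distance = displacement((100.00, 200.00, 50.00), (100.03, 200.04, 50.12))
```

Here the point moved about 3 cm in X, 4 cm in Y, and 12 cm in Z, giving a displacement distance of about 13 cm.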

The results indicate that the deformation estimations of the proposed method are closely aligned with the GNSS measurements, with an RMSE of 2.26 cm, and the monitoring errors in the XYZ directions are mostly less than 2 cm. This demonstrates that centimeter-level regional deformations can be distinguished by using the proposed method.
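The reported RMSE can be cross-checked against the "Sum" column of Table 3. The short calculation below assumes that this column is the difference between the estimated and the GNSS-measured displacement at each station, which is consistent with the tabulated values.

```python
import math

# "Sum" column of Table 3 (estimate minus GNSS measurement), in centimetres.
sum_errors_cm = [1.2, 2.7, 2.6, 1.8, -2.6]

n = len(sum_errors_cm)
rmse_cm = math.sqrt(sum(e * e for e in sum_errors_cm) / n)
mean_abs_cm = sum(abs(e) for e in sum_errors_cm) / n

print(f"RMSE = {rmse_cm:.2f} cm, mean absolute error = {mean_abs_cm:.2f} cm")
```

Under this assumption the five station errors give an RMSE of about 2.26 cm and a mean absolute error of about 2.2 cm.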

4. Discussion

4.1. Effectiveness of Integrating Shape and Texture Features

To illustrate the effectiveness of integrating texture and shape features, the experimental data from the Lijie north hill are again used to estimate deformation with three configurations: only texture features, only shape features, and the integration of both texture and shape features. The experimental results are presented in Table 4.

All three methods are capable of estimating deformations in geological landslide areas. The method that utilizes only shape features performed worse than the other two methods. This discrepancy may be attributed to the presence of overly complex geometric features in some parts of the geological landslide region, while other parts have remarkably similar geometric characteristics, making 3D point matching more challenging in these areas. This also indicates that relying solely on shape features for large-scale deformation estimation is quite challenging. The method that relies solely on texture features outperformed the shape-only approach, benefiting from the localized 2D mapping and the high accuracy of HOG matching. The best performance was achieved by the method that integrates both shape and texture features, highlighting the necessity of integrating these two types of features when estimating deformations. This integrated approach is the one employed in this study.

4.2. Limitations

The proposed method also has some limitations. Firstly, it depends on multi-temporal real-scene 3D model data; consequently, low-resolution UAV imagery or poor-quality 3D reconstruction can adversely affect the accuracy of the deformation estimation. Secondly, the method primarily targets landslide disaster areas before complete destabilization, monitoring a continuous deformation period within that area. Therefore, when a geological disaster area undergoes destabilization that results in large-scale deformations (e.g., reaching tens of meters), or when deformations cause significant changes in geometric and texture features, the method may struggle to accurately estimate such deformations. Thirdly, in landslide areas with dense vegetation, or in regions that lack texture information and have uniform geometric features, it is difficult to perform accurate 3D point matching, making deformation estimation challenging in these areas. The proposed method is therefore best regarded as a supplement to existing landslide monitoring techniques.

5. Conclusions

This study proposed a framework for estimating 3D deformations based on the integration of shape and texture information in real-scene model matching. Accurate correspondences between multi-temporal real-scene 3D models were established, addressing the issues of limited analytical scale, constrained calculation directions, and insufficient accuracy encountered in existing methods for estimating geological deformations. To achieve accurate, robust, and reliable 3D deformation estimation in geological disaster areas, feature descriptors for real-scene 3D models based on texture and shape information were designed and integrated to accurately match 3D points in real-scene models and precisely estimate deformations. Quantitative evaluations were conducted on a dataset from the real-world Lijie north hill geological landslide area, and our method was compared with other classic or leading methods, confirming the practicality and reliability of our approach. In addition to landslide deformation estimation, the proposed method can also be used in other fields requiring deformation assessment, such as the monitoring of hazardous rock bodies and building structures. Future research could involve unattended UAV geological disaster monitoring using new equipment such as UAV airfields, as well as the potential to enhance work efficiency through real-time computations using UAV-mounted computing devices.

Author Contributions

Conceptualization, P.T. and Z.Z.; Data curation, K.X., X.Z. and Y.D.; Formal analysis, K.X. and Z.N.; Funding acquisition, P.T., Y.D. and T.K.; Investigation, K.X., X.Z., Y.D. and T.K.; Methodology, K.X. and P.T.; Project administration, P.T. and Z.Z.; Resources, X.Z. and Y.D.; Software, K.X. and Z.N.; Supervision, P.T. and Z.Z.; Validation, K.X., Z.N. and X.Z.; Visualization, K.X. and Z.N.; Writing—original draft, K.X., P.T. and Z.N.; Writing—review and editing, P.T., X.Z., Y.D. and T.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 41801390, and the National Key Research and Development Program, grant number 2019YFC1509604.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Acknowledgments

We would like to express our gratitude to Wuhan DPCloud Company for the UAV image acquisition and processing.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Casagli, N.; Intrieri, E.; Tofani, V.; Gigli, G.; Raspini, F. Landslide Detection, Monitoring and Prediction with Remote-Sensing Techniques. Nat. Rev. Earth Environ. 2023, 4, 51–64.
Xu, Q.; Zhao, B.; Dai, K.; Dong, X.; Li, W.; Zhu, X.; Yang, Y.; Xiao, X.; Wang, X.; Huang, J.; et al. Remote Sensing for Landslide Investigations: A Progress Report from China. Eng. Geol. 2023, 321, 107156.
Zeng, T.; Wu, L.; Peduto, D.; Glade, T.; Hayakawa, Y.S.; Yin, K. Ensemble Learning Framework for Landslide Susceptibility Mapping: Different Basic Classifier and Ensemble Strategy. Geosci. Front. 2023, 14, 101645.
Huang, F.; Xiong, H.; Jiang, S.-H.; Yao, C.; Fan, X.; Catani, F.; Chang, Z.; Zhou, X.; Huang, J.; Liu, K. Modelling Landslide Susceptibility Prediction: A Review and Construction of Semi-Supervised Imbalanced Theory. Earth-Sci. Rev. 2024, 250, 104700.
Cui, H.; Ji, J.; Hürlimann, M.; Medina, V. Probabilistic and Physically-Based Modelling of Rainfall-Induced Landslide Susceptibility Using Integrated GIS-FORM Algorithm. Landslides 2024, 21, 1461–1481.
Ji, J.; Cui, H.; Zhang, T.; Song, J.; Gao, Y. A GIS-Based Tool for Probabilistic Physical Modelling and Prediction of Landslides: GIS-FORM Landslide Susceptibility Analysis in Seismic Areas. Landslides 2022, 19, 2213–2231.
Chang, Z.; Huang, F.; Huang, J.; Jiang, S.-H.; Liu, Y.; Meena, S.R.; Catani, F. An Updating of Landslide Susceptibility Prediction from the Perspective of Space and Time. Geosci. Front. 2023, 14, 101619.
Yang, H.; Song, K.; Chen, L.; Qu, L. Hysteresis Effect and Seasonal Step-like Creep Deformation of the Jiuxianping Landslide in the Three Gorges Reservoir Region. Eng. Geol. 2023, 317, 107089.
Zhang, Q.; Wang, T. Deep Learning for Exploring Landslides with Remote Sensing and Geo-Environmental Data: Frameworks, Progress, Challenges, and Opportunities. Remote Sens. 2024, 16, 1344.
Calcaterra, S.; Cesi, C.; Di Maio, C.; Gambino, P.; Merli, K.; Vallario, M.; Vassallo, R. Surface Displacements of Two Landslides Evaluated by GPS and Inclinometer Systems: A Case Study in Southern Apennines, Italy. Nat. Hazards 2012, 61, 257–266.
Benoit, L.; Briole, P.; Martin, O.; Thom, C.; Malet, J.-P.; Ulrich, P. Monitoring Landslide Displacements with the Geocube Wireless Network of Low-Cost GPS. Eng. Geol. 2015, 195, 111–121.
Hu, Q.; Kou, Y.; Liu, J.; Liu, W.; Yang, J.; Li, S.; He, P.; Liu, X.; Ma, K.; Li, Y.; et al. TerraSAR-X and GNSS Data for Deformation Detection and Mechanism Analysis of a Deep Excavation Channel Section of the China South–North Water-Diversion Project. Remote Sens. 2023, 15, 3777.
Jiang, N.; Li, H.-B.; Li, C.-J.; Xiao, H.-X.; Zhou, J.-W. A Fusion Method Using Terrestrial Laser Scanning and Unmanned Aerial Vehicle Photogrammetry for Landslide Deformation Monitoring Under Complex Terrain Conditions. IEEE Trans. Geosci. Remote Sens. 2022, 60, 4707214.
Hamza, V.; Stopar, B.; Sterle, O.; Pavlovčič-Prešeren, P. A Cost-Effective GNSS Solution for Continuous Monitoring of Landslides. Remote Sens. 2023, 15, 2287.
Li, W.; Zhan, W.; Lu, H.; Xu, Q.; Pei, X.; Wang, D.; Huang, R.; Ge, D. Precursors to Large Rockslides Visible on Optical Remote-Sensing Images and Their Implications for Landslide Early Detection. Landslides 2023, 20, 1–12.
Dai, K.; Li, Z.; Xu, Q.; Tomas, R.; Li, T.; Jiang, L.; Zhang, J.; Yin, T.; Wang, H. Identification and Evaluation of the High Mountain Upper Slope Potential Landslide Based on Multi-Source Remote Sensing: The Aniangzhai Landslide Case Study. Landslides 2023, 20, 1405–1417.
Liu, X.; Peng, Y.; Lu, Z.; Li, W.; Yu, J.; Ge, D.; Xiang, W. Feature-Fusion Segmentation Network for Landslide Detection Using High-Resolution Remote Sensing Images and Digital Elevation Model Data. IEEE Trans. Geosci. Remote Sens. 2023, 61, 4500314.
Liu, K.; Liao, Y.; Yang, K.; Xi, K.; Chen, Q.; Tao, P.; Ke, T. Efficient Radiometric Triangulation for Aerial Image Consistency across Inter and Intra Variances. Int. J. Appl. Earth Obs. Geoinf. 2024, 130, 103911.
Xin, W.; Pu, C.; Liu, W.; Liu, K. Landslide Surface Horizontal Displacement Monitoring Based on Image Recognition Technology and Computer Vision. Geomorphology 2023, 431, 108691.
Abellán, A.; Oppikofer, T.; Jaboyedoff, M.; Rosser, N.J.; Lim, M.; Lato, M.J. Terrestrial Laser Scanning of Rock Slope Instabilities. Earth Surf. Process. Landf. 2014, 39, 80–97.
Monserrat, O.; Crosetto, M.; Luzi, G. A Review of Ground-Based SAR Interferometry for Deformation Measurement. ISPRS J. Photogramm. Remote Sens. 2014, 93, 40–48.
Li, M.; Zhang, L.; Ding, C.; Li, W.; Luo, H.; Liao, M.; Xu, Q. Retrieval of Historical Surface Displacements of the Baige Landslide from Time-Series SAR Observations for Retrospective Analysis of the Collapse Event. Remote Sens. Environ. 2020, 240, 111695.
Tao, P.; Xi, K.; Niu, Z.; Chen, Q.; Liao, Y.; Liu, Y.; Liu, K.; Zhang, Z. Optimal Selection from Extremely Redundant Satellite Images for Efficient Large-Scale Mapping. ISPRS J. Photogramm. Remote Sens. 2022, 194, 21–38.
Zhou, C.; Cao, Y.; Gan, L.; Wang, Y.; Motagh, M.; Roessner, S.; Hu, X.; Yin, K. A Novel Framework for Landslide Displacement Prediction Using MT-InSAR and Machine Learning Techniques. Eng. Geol. 2024, 334, 107497.
Zeng, T.; Wu, L.; Hayakawa, Y.S.; Yin, K.; Gui, L.; Jin, B.; Guo, Z.; Peduto, D. Advanced Integration of Ensemble Learning and MT-InSAR for Enhanced Slow-Moving Landslide Susceptibility Zoning. Eng. Geol. 2024, 331, 107436.
Niu, Z.; Xia, H.; Tao, P.; Ke, T. Accuracy Assessment of UAV Photogrammetry System with RTK Measurements for Direct Georeferencing. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2024, 10, 169–176.
Hussain, Y.; Schlögel, R.; Innocenti, A.; Hamza, O.; Iannucci, R.; Martino, S.; Havenith, H.-B. Review on the Geophysical and UAV-Based Methods Applied to Landslides. Remote Sens. 2022, 14, 4564.
Jiang, N.; Li, H.; Hu, Y.; Zhang, J.; Dai, W.; Li, C.; Zhou, J.-W. A Monitoring Method Integrating Terrestrial Laser Scanning and Unmanned Aerial Vehicles for Different Landslide Deformation Patterns. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 10242–10255.
Ciccarese, G.; Tondo, M.; Mulas, M.; Bertolini, G.; Corsini, A. Rapid Assessment of Landslide Dynamics by UAV-RTK Repeated Surveys Using Ground Targets: The Ca’ Lita Landslide (Northern Apennines, Italy). Remote Sens. 2024, 16, 1032.
Xi, K.; Duan, Y. AMS-3000 Large Field View Aerial Mapping System: Basic Principles and The Workflow. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2020, 43, 79–84.
França Pereira, F.; Sussel Gonçalves Mendes, T.; Jorge Coelho Simões, S.; Roberto Magalhães de Andrade, M.; Luiz Lopes Reiss, M.; Fortes Cavalcante Renk, J.; Correia da Silva Santos, T. Comparison of LiDAR- and UAV-Derived Data for Landslide Susceptibility Mapping Using Random Forest Algorithm. Landslides 2023, 20, 579–600.
Nikolakopoulos, K.G.; Kyriou, A.; Koukouvelas, I.K.; Tomaras, N.; Lyros, E. UAV, GNSS, and InSAR Data Analyses for Landslide Monitoring in a Mountainous Village in Western Greece. Remote Sens. 2023, 15, 2870.
Zhou, J.; Jiang, N.; Li, C.; Li, H. A Landslide Monitoring Method Using Data from Unmanned Aerial Vehicle and Terrestrial Laser Scanning with Insufficient and Inaccurate Ground Control Points. J. Rock Mech. Geotech. Eng. 2024; in press.
Furukawa, Y.; Ponce, J. Accurate, Dense, and Robust Multiview Stereopsis. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 32, 1362–1376.
Samodra, G.; Ramadhan, M.F.; Sartohadi, J.; Setiawan, M.A.; Christanto, N.; Sukmawijaya, A. Characterization of Displacement and Internal Structure of Landslides from Multitemporal UAV and ERT Imaging. Landslides 2020, 17, 2455–2468.
Turner, D.; Lucieer, A.; De Jong, S.M. Time Series Analysis of Landslide Dynamics Using an Unmanned Aerial Vehicle (UAV). Remote Sens. 2015, 7, 1736–1757.
Teo, T.-A.; Fu, Y.-J.; Li, K.-W.; Weng, M.-C.; Yang, C.-M. Comparison between Image- and Surface-Derived Displacement Fields for Landslide Monitoring Using an Unmanned Aerial Vehicle. Int. J. Appl. Earth Obs. Geoinf. 2023, 116, 103164.
Batur, M.; Yilmaz, O.; Ozener, H. A Case Study of Deformation Measurements of Istanbul Land Walls via Terrestrial Laser Scanning. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 6362–6371.
He, H.; Ming, Z.; Zhang, J.; Wang, L.; Yang, R.; Chen, T.; Zhou, F. Robust Estimation of Landslide Displacement from Multitemporal UAV Photogrammetry-Derived Point Clouds. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 6627–6641.
Nourbakhshbeidokhti, S.; Kinoshita, A.M.; Chin, A.; Florsheim, J.L. A Workflow to Estimate Topographic and Volumetric Changes and Errors in Channel Sedimentation after Disturbance. Remote Sens. 2019, 11, 586.
Lague, D.; Brodu, N.; Leroux, J. Accurate 3D Comparison of Complex Topography with Terrestrial Laser Scanner: Application to the Rangitikei Canyon (N-Z). ISPRS J. Photogramm. Remote Sens. 2013, 82, 10–26.
Huang, R.; Jiang, L.; Shen, X.; Dong, Z.; Zhou, Q.; Yang, B.; Wang, H. An Efficient Method of Monitoring Slow-Moving Landslides with Long-Range Terrestrial Laser Scanning: A Case Study of the Dashu Landslide in the Three Gorges Reservoir Region, China. Landslides 2019, 16, 839–855.
Jafari, B.; Khaloo, A.; Lattanzi, D. Deformation Tracking in 3D Point Clouds Via Statistical Sampling of Direct Cloud-to-Cloud Distances. J. Nondestruct. Eval. 2017, 36, 65.
Gojcic, Z.; Schmid, L.; Wieser, A. Dense 3D Displacement Vector Fields for Point Cloud-Based Landslide Monitoring. Landslides 2021, 18, 3821–3832.
Qin, Y.; Duan, Y. A method for measuring large-scale deformation of landslide bodies based on nap-of-the-object photogrammetry. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2023, 48, 29–35.
Johnson, A.E.; Hebert, M. Using Spin Images for Efficient Object Recognition in Cluttered 3D Scenes. IEEE Trans. Pattern Anal. Mach. Intell. 1999, 21, 433–449.
Frome, A.; Huber, D.; Kolluri, R.; Bülow, T.; Malik, J. Recognizing Objects in Range Data Using Regional Point Descriptors. In Proceedings of the Computer Vision—ECCV 2004, Prague, Czech Republic, 11–14 May 2004; Pajdla, T., Matas, J., Eds.; Springer: Berlin/Heidelberg, Germany, 2004; pp. 224–237.
Guo, Y.; Sohel, F.; Bennamoun, M.; Lu, M.; Wan, J. Rotational Projection Statistics for 3D Local Surface Description and Object Recognition. Int. J. Comput. Vis. 2013, 105, 63–86.
Rusu, R.B.; Blodow, N.; Beetz, M. Fast Point Feature Histograms (FPFH) for 3D Registration. In Proceedings of the 2009 IEEE International Conference on Robotics and Automation, Kobe, Japan, 12–17 May 2009; pp. 3212–3217.
Tombari, F.; Salti, S.; Di Stefano, L. Unique Signatures of Histograms for Local Surface Description. In Proceedings of the Computer Vision—ECCV, Crete, Greece, 5–11 September 2010; Daniilidis, K., Maragos, P., Paragios, N., Eds.; Springer: Berlin/Heidelberg, Germany, 2010; pp. 356–369.
Yang, B.; Liu, Y.; Dong, Z.; Liang, F.; Li, B.; Peng, X. 3D Local Feature BKD to Extract Road Information from Mobile Laser Scanning Point Clouds. ISPRS J. Photogramm. Remote Sens. 2017, 130, 329–343.
Bai, X.; Luo, Z.; Zhou, L.; Fu, H.; Quan, L.; Tai, C.-L. D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; IEEE: Seattle, WA, USA, 2020; pp. 6358–6366.
Ao, S.; Hu, Q.; Yang, B.; Markham, A.; Guo, Y. SpinNet: Learning a General Surface Descriptor for 3D Point Cloud Registration. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; IEEE: Nashville, TN, USA, 2021; pp. 11748–11757.
Zhang, Z.; Duan, Y.; Tao, P. From Ground Control Point to Digital Control Photo. Geomat. Inf. Sci. Wuhan Univ. 2023, 48, 1715–1723.
Jung, J.; Che, E.; Olsen, M.J.; Shafer, K.C. Automated and Efficient Powerline Extraction from Laser Scanning Data Using a Voxel-Based Subsampling with Hierarchical Approach. ISPRS J. Photogramm. Remote Sens. 2020, 163, 343–361.
Wang, R.; Yan, J.; Yang, X. Neural Graph Matching Network: Learning Lawler’s Quadratic Assignment Problem With Extension to Hypergraph and Multiple-Graph Matching. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 44, 5261–5279.

Figure 1. Workflow of the proposed 3D deformation estimation framework for multi-temporal geological landslide UAV images.

Figure 2. Voxel-based interest point extraction. (a) Initial mesh point cloud; (b) voxel size is 10.0; (c) voxel size is 2.0.

Figure 3. The process of generating texture feature descriptors from local 2D images.

Figure 4. The process of generating shape feature descriptors.

Figure 5. Overview of the study area. (a) Location of the Lijie north hill geohazard area in Gansu province, China. (b) The orthophoto of the test area. (c) Distribution of the manually measured checkpoints.

Figure 6. Distribution of deformations in the XYZ directions for the manually measured checkpoints, where the horizontal axis represents the serial numbers of the checkpoints, and the vertical axis shows the deformation values in each direction.

Figure 7. Interest points extracted by the proposed method and baseline methods. The color of interest points changes from blue to red, representing elevation values from low to high.

Figure 8. Comparative 3D point matches in the Lijie north hill test area.

Figure 9. Comparison of point matching results by the proposed method and baseline methods in the local area of the Lijie north hill. Subplot (a) represents the interest points that need to be matched, and (b) shows the ground truth results for homologous points. In subplots (c–h), the yellow points represent the homologous points matched by each respective method, and the red points represent the ground truth homologous points.

Figure 10. Distribution of residual errors in the X, Y, and Z directions, where each subplot’s x-axis indicates the corresponding residual value, and the y-axis denotes frequency. The subplot (a) represents the distribution of residual errors in the X direction, subplot (b) represents the Y direction, and subplot (c) represents the Z direction.

Figure 11. Distribution of the GNSS stations.

Table 1. Quantitative analysis results of the extracted interest points in terms of distribution and efficiency. The best values for the different metrics are highlighted in bold.

Method	Displacement Point Coverage (%)	Number of Interest Points	Extraction Time (s)
SIFT3D	10.2	1579	247
Harris3D	42.2	3287	268
ISS3D	53.7	3653	356
The proposed method	100.0	4091	195

Table 2. Effectiveness comparison of 3D point matching results. The best values for the different metrics are highlighted in bold.

Method	RMSE $X$ (m)	RMSE $Y$ (m)	RMSE $Z$ (m)	$σ$ $X$ (m)	$σ$ $Y$ (m)	$σ$ $Z$ (m)	CMR (%)
3DSC	0.723	0.684	0.841	0.695	0.663	0.789	32.2
FPFH	0.203	0.231	0.195	0.207	0.225	0.186	66.7
SHOT	0.184	0.188	0.174	0.181	0.179	0.163	79.4
RoPS	0.089	0.085	0.092	0.086	0.083	0.094	90.7
SpinNet	0.187	0.186	0.195	0.182	0.189	0.187	76.5
The proposed method	0.010	0.010	0.011	0.009	0.010	0.011	100.0

Table 3. Comparison of the deformation estimations and the GNSS measurements.

GNSS ID	Deformation Estimations of the Proposed Method (cm)	GNSS Measurement (cm)	Error (cm)
GNSS ID	Deformation Estimations of the Proposed Method (cm)	GNSS Measurement (cm)	$Δ X$	$Δ Y$	$Δ Z$	Sum
1	19.0	17.8	−0.6	−0.6	−0.9	1.2
2	21.2	18.5	−1.4	−1.2	−2.0	2.7
3	11.7	9.1	−0.4	−0.8	−2.4	2.6
4	15.5	13.7	−0.2	−1.3	−1.3	1.8
5	40.6	43.2	1.8	0.4	1.9	−2.6
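
As a quick arithmetic check on Table 3, the sketch below recomputes the Sum column as the estimated deformation minus the GNSS measurement and then averages the absolute values. Averaging absolute errors is an assumption about how the reported mean was obtained; it is consistent with the average error of 2.2 cm stated in the abstract. The variable names are illustrative only.

```python
# (GNSS ID, estimated deformation in cm, GNSS-measured deformation in cm), from Table 3.
table3_rows = [
    (1, 19.0, 17.8),
    (2, 21.2, 18.5),
    (3, 11.7, 9.1),
    (4, 15.5, 13.7),
    (5, 40.6, 43.2),
]

# Signed error per station: estimate minus GNSS measurement, rounded to the
# table's one-decimal precision to suppress floating-point noise.
signed_errors = [round(est - gnss, 1) for _, est, gnss in table3_rows]

# Mean absolute error over the five stations (assumed averaging convention).
mean_abs_error = sum(abs(e) for e in signed_errors) / len(signed_errors)

print(signed_errors)
print(round(mean_abs_error, 1))
```

The signed errors reproduce the Sum column, including the negative value at station 5, where the method underestimates the GNSS-measured deformation.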

Table 4. Accuracy comparison of 3D point matching results by using different features. The best values for the different metrics are highlighted in bold.

Method	RMSE X (m)	RMSE Y (m)	RMSE Z (m)	σ X (m)	σ Y (m)	σ Z (m)
Only shape feature	0.079	0.081	0.086	0.075	0.078	0.083
Only texture feature	0.017	0.018	0.020	0.015	0.017	0.019
Shape and texture features	0.010	0.010	0.011	0.009	0.010	0.011

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

