1. Introduction
The increasing volumes of trajectory datasets bring challenges in storage, querying, and processing [1,2,3], which can be addressed by applying trajectory compression (or simplification). Trajectory compression refers to the elimination of points in a trajectory which do not contain new information [4]. Through trajectory compression, the overall volume of data is reduced, thereby facilitating data storage and analysis [4,5]. It also accelerates and improves the performance of subsequent trajectory processing [2,5]. In practical applications, trajectory compression is an effective method for preprocessing and visualizing massive trajectory data [6].
Algorithms use different criteria to identify redundant trajectory points [5], such as offset distance [7], spatial topology [8], time, velocity, or combinations of these, e.g., a combination of time and distance [5] or a combination of velocity and distance [9]. Although a compression algorithm eliminates redundant data, it must preserve the moving object's behavioral information within the trajectory [8], minimize information loss, and maintain trajectory quality [10].
The ultimate criterion for trajectory quality is the extent to which the trajectory can be compressed without damaging its use for further processing [9], meaning that there is a trade-off between the number of eliminated points and trajectory quality after compression [11]. Therefore, maintaining trajectory quality and obtaining a consistent or meaningful output is a complicated problem [12]. A trajectory compression method is typically evaluated based on (1) processing time, (2) compression ratio, and (3) error measurement [2], which are (1) the duration of the compression process, (2) the ratio between the sizes of the simplified and original trajectories, and (3) the deviation between the compressed and original trajectories, respectively [13]. The processing time is used to evaluate efficiency, while the remaining metrics assess the effectiveness of the process [13].
In trajectory databases, various techniques have been developed to compress trajectory data. The most common method is Piecewise Linear Segmentation (PLS) [14], which is intuitive, easy to implement, and relatively fast [15]. The Douglas–Peucker (DP) algorithm, another popular method, is based on PLS [16] and guarantees an error-bounded compression process [1]. PLS and DP have a worst-case time complexity of O(n²). While these methods are straightforward, obtaining consistent or meaningful results can be challenging [12]. To address this, DPhull was introduced, leveraging convex hull properties to achieve the same compression ratio as DP while reducing the complexity to O(n log n) [1].
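To make the recursion underlying DP concrete, the following is a minimal sketch of the classic algorithm in plain Python (function and variable names are illustrative; a planar perpendicular-distance measure is assumed, whereas production implementations for geographic coordinates would use a geodetic distance):

```python
import math

def _perp_dist(p, a, b):
    """Perpendicular distance from point p to the line through a and b
    (planar approximation)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len = math.hypot(dx, dy)
    if seg_len == 0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    return abs(dx * (a[1] - p[1]) - dy * (a[0] - p[0])) / seg_len

def douglas_peucker(points, epsilon):
    """Keep the interior point farthest from the anchor segment if it
    deviates more than epsilon; recurse on both halves, else keep only
    the endpoints. Worst case O(n^2)."""
    if len(points) < 3:
        return list(points)
    dists = [_perp_dist(p, points[0], points[-1]) for p in points[1:-1]]
    idx = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[idx - 1] > epsilon:
        left = douglas_peucker(points[:idx + 1], epsilon)
        right = douglas_peucker(points[idx:], epsilon)
        return left[:-1] + right  # drop duplicate split point
    return [points[0], points[-1]]
```

The quadratic worst case arises when each recursion removes only one point, so the farthest-point scan is repeated over nearly the whole trajectory at every level.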
To address both spatial and temporal dimensions of trajectory data, the TD-DP method was developed [17]. It uses Synchronous Euclidean Distance (SED) to account for temporal characteristics. Additionally, Zhang et al. [18] introduced the Adaptive Core Threshold Difference-DP (ACTD-DP) algorithm, which builds on DP by incorporating ships' course values. ACTD-DP compresses trajectories by employing curve fitting to establish a mathematical relationship between the compression threshold and the number of points. This function aids in determining the optimal core threshold, a critical parameter for controlling the algorithm's accuracy and efficiency. Guo et al. [19] proposed Top-Down Kinematic Compression (TDKC), an adaptive algorithm designed to preserve key features in AIS trajectory data, such as time, position, speed, and course. TDKC uses a Compression Binary Tree to address recursion termination and automatic threshold determination. A case study in the Gulf of Finland demonstrated TDKC's superior performance over conventional and improved algorithms, proving its effectiveness in maritime traffic analysis.
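For reference, the SED used by TD-DP compares each point with its time-synchronized counterpart on the approximating segment: for a point $p_i$ recorded at time $t_i$ between segment endpoints $p_s$ (at $t_s$) and $p_e$ (at $t_e$),

```latex
p_i' = p_s + \frac{t_i - t_s}{t_e - t_s}\,(p_e - p_s), \qquad
\mathrm{SED}(p_i) = \lVert p_i - p_i' \rVert_2 .
```

Unlike the perpendicular distance of classic DP, SED therefore penalizes points that deviate from where the object *should have been* at time $t_i$ under linear motion, capturing temporal as well as spatial error.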
Semantic trajectory compression has also gained attention. Zhou et al. [20] proposed a method that integrates semantic and spatio-temporal features using information entropy and Synchronous Euclidean Distance. By prioritizing feature points with the maximum semantic-spatio-temporal distance, the approach preserves semantic similarity and enhances the interpretability of compressed trajectories. Gao et al. [21] introduced another semantic trajectory compression technique inspired by synchronization. It uses a multi-resolution clustering model to create hierarchical semantic regions of interest (ROIs), compressing trajectories into sequences of semantic ROIs. The model is extendable to data streams and has been validated on synthetic and real-world datasets, demonstrating its efficiency and effectiveness.
Compression methods often face challenges when using position-preserving error measures, as these can lose critical information required for analyses such as trajectory clustering. To address this, Direction-Preserving Trajectory Simplification (DPTS) was developed [22]. By employing an angular distance measure, DPTS preserves both directional and positional information. However, this approach can increase computational complexity, illustrating the challenge of balancing the elimination of redundant points with the retention of meaningful movement patterns. This balance is vital for applications such as behavior analysis [11,12].
A GPU-based parallel computing framework was developed with the Adaptive DP with Speed and Course (ADPSC) algorithm for large datasets and real-time applications. This framework optimizes threshold calculations and accelerates trajectory compression. Experiments conducted on vessel trajectory datasets demonstrated the superior performance of the ADPSC algorithm in compressing trajectories. The GPU framework significantly reduced processing times, supporting real-time decision making. These advancements enhance maritime safety and minimize data storage costs, providing critical support for autonomous shipping in complex waters [23].
The movement of objects, such as animals, people, or vehicles, is often influenced by context, like weather, which can either enable or restrict movement [24,25]. For instance, vessels avoid rough seas or shallow waters, and humans become more aggressive in high temperatures [26,27,28]. Therefore, ignoring context in trajectory compression can lead to inconsistent results [8,12]. For example, in maritime trajectories, compression may produce segments that intersect coastlines, rendering them invalid. To address this, a contextual DP algorithm was developed to preserve topology and direction. However, it struggles to retain stopping points, which can misrepresent object behavior [9,12]. A two-stage context-aware PLS was proposed to solve the problem of retaining stopping points. This method first analyzes speed to detect stops and then integrates topological relations into PLS to ensure logical consistency in the simplified trajectory [8]. By incorporating context, this approach addresses the shortcomings of traditional methods, enabling more accurate and reliable trajectory compression for applications such as behavior analysis and movement modeling.
Previous context-aware trajectory compression methods primarily focus on the geometric properties of context, which leaves a persistent risk of losing meaningful trajectory points. This paper proposes incorporating spatial, non-spatial, and semantic context domains into the trajectory compression process to address this limitation. Initially, a novel context-aware trajectory formulation is introduced to effectively capture the contextual information surrounding the moving agent. Building on this foundation, a context-aware Douglas–Peucker algorithm is developed to minimize the risk of discarding significant trajectory points.
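The proposed algorithm itself is defined in Section 2; purely as an illustrative sketch of the general idea, one way to make the DP recursion context-aware is to keep a point if either its spatial deviation or the deviation of an annotated context attribute exceeds its own threshold. Everything below is an assumption for illustration, not the paper's formulation: context deviation is measured against index-wise linear interpolation between segment endpoints, distances are planar, and both deviations are normalized by their thresholds.

```python
import math

def _perp(p, a, b):
    """Planar perpendicular distance from p to the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len = math.hypot(dx, dy) or 1e-12
    return abs(dx * (a[1] - p[1]) - dy * (a[0] - p[0])) / seg_len

def context_aware_dp(points, context, eps_spatial, eps_context):
    """points: [(x, y)]; context: parallel list with one context value per
    point (e.g. water depth). A point is retained when its spatial OR
    context deviation exceeds the corresponding threshold."""
    if len(points) < 3:
        return list(points), list(context)
    n = len(points)
    best_i, best_score = None, 1.0  # score 1.0 == exactly at threshold
    for i in range(1, n - 1):
        s = _perp(points[i], points[0], points[-1]) / eps_spatial
        interp = context[0] + (context[-1] - context[0]) * i / (n - 1)
        c = abs(context[i] - interp) / eps_context
        if max(s, c) > best_score:
            best_i, best_score = i, max(s, c)
    if best_i is None:  # no deviation above either threshold
        return [points[0], points[-1]], [context[0], context[-1]]
    lp, lc = context_aware_dp(points[:best_i + 1], context[:best_i + 1],
                              eps_spatial, eps_context)
    rp, rc = context_aware_dp(points[best_i:], context[best_i:],
                              eps_spatial, eps_context)
    return lp[:-1] + rp, lc[:-1] + rc
```

In this sketch, a spatially collinear point with an abrupt context change (e.g. a sudden depth drop) survives compression even though plain DP would discard it, which is exactly the failure mode the proposed method targets.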
The remainder of this paper is organized as follows: Section 2 summarizes the DP algorithm and the proposed methodology. Section 3 presents and discusses the implementation results. Section 4 concludes the paper and suggests directions for future research.
3. Results and Discussion
3.1. Datasets and Preprocesses
This paper uses three datasets. First, vessel movement data were collected by Automatic Identification Systems (AIS), containing 3,213,700 locations of 1125 ships traveling around the English Channel in February 2016 (Figure 1). These AIS messages are converted into trajectories using a Python library named MovingPandas [41].
According to [42], the most influential context factors for vessel trajectory prediction are wave direction, wave height, and water depth. Therefore, these factors are used in this paper as the environmental context. The parameters come from an ocean condition dataset [43] covering a longitude range from 0° W to 10° W and a latitude range from 45° N to 51° N. The dataset has a three-hour temporal resolution and contains 43,206 gridded sampling points.
To link movement and context data coming from different sources with different temporal resolutions, this paper used the trajectory annotation method described in [42]. Trajectory annotation includes spatial and temporal interpolation [27]. As a result of annotation, each trajectory point contains environmental context. A Python environment and libraries such as MovingPandas [41] and tslearn [44] are used for the implementation. Trajectories are compressed using CADP with the maximum allowed distance threshold, ϵmax, set to 1 km and different thresholds for the context parameters. Then, the runtime, compression ratio R, and similarity between the simplified and original trajectories are calculated. The CADP-compressed trajectories are then used in a deep learning trajectory prediction model to evaluate the effectiveness of CADP and its effects on further trajectory analysis.
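The annotation step described above can be sketched as follows. This is a simplified stand-in for the method of [42], not its actual implementation: it assumes nearest-neighbor matching in space over the context grid and linear interpolation in time between the two bracketing three-hour snapshots; the array layout and function name are illustrative.

```python
import numpy as np

def annotate_point(lon, lat, t, grid_lons, grid_lats, times, values):
    """Annotate one trajectory point with a gridded context value.

    grid_lons, grid_lats: coordinates of the grid cells (1-D, same length)
    times: snapshot timestamps, sorted ascending
    values: context values shaped [time, cell]
    Spatial step: nearest grid cell; temporal step: linear interpolation
    between the two snapshots bracketing t."""
    cell = int(np.argmin((grid_lons - lon) ** 2 + (grid_lats - lat) ** 2))
    j = np.searchsorted(times, t)
    if j == 0:
        return values[0, cell]          # before first snapshot
    if j >= len(times):
        return values[-1, cell]         # after last snapshot
    w = (t - times[j - 1]) / (times[j] - times[j - 1])
    return (1 - w) * values[j - 1, cell] + w * values[j, cell]
```

Applying this per trajectory point and per context variable (wave direction, wave height, water depth) yields the annotated trajectories used in the experiments.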
3.2. Effects of Context on Compression Ratio and Similarity of Trajectories
Figure 2 shows that the compression ratio rises as the threshold increases. The larger the threshold, the higher the DTW distance (i.e., the smaller the similarity). The similarity graphs show sharp increases for all context parameters and exhibit critical points: beyond these, increasing the threshold loses more useful data than it reduces volume. In other words, the compression ratio increases more slowly after the critical point.
Furthermore, according to Figure 2, the similarity graphs of the compressed and original datasets converge. After the convergence point, increasing the context threshold has little further effect on similarity, meaning that most trajectory points containing useful context data have already been discarded. It is, therefore, possible to save those essential trajectory points using CADP.
Table 1 shows the compression ratio and similarity of simplified trajectories for different wave direction settings in the CADP method. The compression ratio increased by 1.68% at point 3, while the similarity of the compressed trajectories decreased by 72.99%. In other words, 1.68% of the trajectory points contain crucial information. Although the overall behavior of CADP is the same in all three experiments, i.e., as the compression ratio increases, the similarity decreases, there are slight differences in context behavior. This could help to balance the similarity of the original and compressed datasets and to select appropriate compression thresholds. Lastly, Figure 3 visualizes a trajectory before and after compression with the CADP algorithm.
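The two evaluation metrics used throughout this subsection can be stated compactly. The paper computes the similarity with tslearn; the sketch below uses a plain NumPy dynamic-programming DTW with Euclidean ground distance for self-containedness (tslearn's `dtw` may differ slightly in its accumulation convention), and the compression-ratio definition as the percentage of points removed is an assumption consistent with the reported values.

```python
import numpy as np

def compression_ratio(n_original, n_compressed):
    """Percentage of trajectory points removed by compression."""
    return 100.0 * (1 - n_compressed / n_original)

def dtw_distance(a, b):
    """Dynamic Time Warping distance between two 2-D point sequences,
    accumulated Euclidean cost over the optimal warping path."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

A lower DTW value between the compressed and original trajectory means higher similarity, which is why the text treats a lower similarity score as better at a comparable compression ratio.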
Additionally, the DP and two-stage Context-Aware Piecewise Linear Segmentation (CPLS) algorithms are implemented for comparison. As the two-stage CPLS takes context and topological relationships into the trajectory compression process [6], comparing these methods offers a better-balanced test of the effectiveness of CADP in preserving spatial and contextual fidelity. In both algorithms, the spatial setting, i.e., ϵmax, is set to 1 km.
Using the DP algorithm, the compression ratio was 76.78% with a similarity score of 524,860, while the two-stage CPLS method achieved a compression ratio of 72.08% with a similarity score of 334,029. These results reflect the trade-off between compression efficiency and trajectory fidelity. Since better performance is indicated by a lower similarity score at a comparable compression ratio, two-stage CPLS outperforms DP in trajectory compression, with better retention of crucial spatial and contextual information. These findings provide a point of reference for comparing the performance of the proposed CADP algorithm, whose goal is the simultaneous optimization of both compression efficiency and context retention.
3.3. Effects of CADP on Trajectory Prediction
Accurate vessel trajectory prediction can improve safety management at sea. To evaluate CADP and the effects of a good compression method, this paper used a Long Short-Term Memory (LSTM) network for trajectory prediction. This method is widely used in maritime systems and applications, including collision avoidance, vessel route planning, and anomaly detection systems [42,45,46,47,48].
All trajectory datasets, compressed by DP, two-stage CPLS, and CADP, are divided into two main groups. The first group contains 80% of the data and is used for modeling (training). The remaining data are used as a test set for accuracy assessment. Parameter selection is a crucial task for recurrent neural networks and is a trial-and-error process. The Adam optimizer [49], a stochastic gradient descent method with adaptive learning rates, was used in the experiments. Including more hidden layers in a deep neural network can enhance the model's performance and learning ability; each layer contains 135 hidden units. The gradient threshold was set to 1 to prevent exploding gradients. Furthermore, the initial learning rate was 0.004, and the learning rate was reduced after 140 epochs by multiplying by a factor of 0.2. The LSTM takes three steps of a trajectory and predicts the next three steps, i.e., the average prediction interval is 15 s. According to the average speed of vessels in the dataset, a vessel moves about 141 m in 15 s. Evaluation results are summarized in Table 2.
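The hyperparameters listed above can be assembled into a model skeleton as follows. The paper does not specify the exact architecture, framework, or layer count, so this PyTorch sketch is an illustrative stand-in: the class name, the two-layer choice, and the linear output head are assumptions; only the hidden size (135), optimizer (Adam, lr 0.004), learning-rate drop (×0.2 after 140 epochs), gradient threshold (1), and the 3-step-in/3-step-out setting come from the text.

```python
import torch
import torch.nn as nn

class TrajPredictor(nn.Module):
    """Reads 3 trajectory steps and predicts the next 3 (x, y) positions."""
    def __init__(self, n_features=2, hidden=135, n_layers=2, horizon=3):
        super().__init__()
        self.horizon = horizon
        self.lstm = nn.LSTM(n_features, hidden, num_layers=n_layers,
                            batch_first=True)
        self.head = nn.Linear(hidden, n_features * horizon)

    def forward(self, x):                  # x: (batch, 3, n_features)
        out, _ = self.lstm(x)
        y = self.head(out[:, -1])          # last hidden state -> future steps
        return y.view(-1, self.horizon, x.size(-1))

model = TrajPredictor()
opt = torch.optim.Adam(model.parameters(), lr=0.004)
# learning rate multiplied by 0.2 after 140 epochs, as reported
sched = torch.optim.lr_scheduler.StepLR(opt, step_size=140, gamma=0.2)
# gradient threshold of 1, applied after loss.backward() in the training loop:
# torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
```

A forward pass on a batch of 3-step input windows returns a tensor of 3 predicted steps per sample, matching the evaluation setup of Table 2.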
The results highlight the ability of context-aware compression algorithms such as CADP and two-stage CPLS to reduce complexity while preserving the information in the trajectory data that is relevant for prediction. According to Table 2, the validation RMSE values demonstrate a general performance improvement for context-aware compressed trajectories compared to their counterparts in most situations. DP-compressed data perform much better than uncompressed trajectories, with an RMSE of 0.012226 versus 0.048789. This is because compressing spatial trajectories significantly reduces the prediction error by removing redundant or less critical points that contribute little to trajectory prediction accuracy, thereby reducing the complexity of the data during LSTM training.
The two-stage-CPLS-compressed trajectories result in an RMSE of 0.0047859. This is a substantial improvement over DP-compressed data, but the error is larger than that of the best-performing CADP variant, Spatial + Water Depth. This likely implies that including topological relationships and contextual features in two-stage CPLS retains important trajectory information essential for good prediction accuracy. However, the relatively higher RMSE compared to CADP might indicate that two-stage CPLS does not prioritize certain contextual dimensions as much as CADP does, since CADP is tailor-made to include environmental factors such as water depth and wave characteristics. The CADP-compressed trajectories contain specific contextual information, such as water depth, wave height, and wave direction, and achieve the best overall prediction accuracy. This is probably because context-aware compression retains essential spatial information while integrating environmental factors that significantly influence vessel movement.
Spatial + Water Depth (Compressed by CADP) yields the lowest RMSE of 0.004195, suggesting that water depth plays a significant role in accurately predicting vessel trajectories. This may be because vessels adjust their movements based on water depth, particularly in shallow or coastal regions.
Spatial + Wave Height (Compressed by CADP) results in an RMSE of 0.005792, meaning that wave height, though important, is somewhat less influential in these data than water depth; it still reduces the error and provides a better prediction than a DP-compressed dataset.
Spatial + Wave Direction (Compressed by CADP) shows an RMSE of 0.004738, demonstrating that wave direction influences vessel movement and prediction accuracy; it provides a better prediction than DP-compressed and two-stage-CPLS-compressed trajectories.
The model focuses on the most critical features by compressing trajectories before training and avoids overfitting redundant data. Two-stage CPLS and CADP provide advantages over DP by incorporating contextual features that improve prediction accuracy. However, CADP’s tailored approach to specific contextual dimensions, such as water depth, achieves the best performance, highlighting its effectiveness in reducing RMSE while maintaining computational efficiency. The results suggest that compression methods enhance prediction accuracy and optimize the training process by distilling essential trajectory characteristics.
3.4. Computational Efficiency Analysis
The worst-case theoretical time complexity of the proposed CADP algorithm is O(n²), which matches the DP algorithm. This means the changes introduced by CADP to add contextual awareness did not harm DP's theoretical time complexity. In contrast, the two-stage CPLS has a worst-case theoretical time complexity of O(n³), reflecting a higher computational overhead than CADP. This supports CADP's scalability and shows that adding contextual domains was achieved without sacrificing computational efficiency. Such an efficient compression method therefore has significant potential for enhancing a wide range of trajectory data analysis applications, such as location-based services, movement pattern analysis, and transportation optimization [8].
Besides the above theoretical time complexity evaluation, runtime measurements were conducted on trajectory datasets of varying sizes to benchmark the performance of the proposed CADP algorithm against the original DP algorithm. The experiments were performed in a controlled test environment using an AMD Ryzen 7 7700X 4.5 GHz CPU, 16 GB RAM, and an NVIDIA GeForce RTX 4060 Ti GPU. The experiments aimed to determine how much time both algorithms require to perform the task, compressing trajectories with a different number of points (n).
Annotated trajectory datasets of five sizes were prepared for the experiments: 1000, 5000, 10,000, 50,000, and 100,000 points per trajectory. In all sets of experiments, the spatial compression threshold (ϵ) was fixed at 10 m, and for CADP, the wave height, wave direction, and bottom depth thresholds were 0.15 m, 1.5 degrees, and 80 m, respectively. Each experiment was repeated 10 times to account for variability, and the average runtimes were recorded (Table 3 and Figure 4). Since the runtime of the two-stage CPLS method was significantly different from the other algorithms, a separate figure (Figure 5) was created to compare the runtime performance of DP and CADP. This gives a better view of their relative scalability and computational efficiency.
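The repeated-measurement protocol described above can be sketched as a small timing harness (the function name and interface are illustrative; the actual benchmark code is not part of the paper):

```python
import time
import statistics

def benchmark(compress_fn, trajectory, repeats=10):
    """Average wall-clock runtime in milliseconds of a compression
    function over several repetitions, mirroring the protocol of
    10 runs per dataset size."""
    runtimes = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        compress_fn(trajectory)
        runtimes.append((time.perf_counter() - t0) * 1000.0)
    return statistics.mean(runtimes)
```

Using a monotonic high-resolution clock (`time.perf_counter`) and averaging over repetitions reduces the influence of scheduler noise on the reported runtimes.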
The theoretical time complexity of the two-stage CPLS method is O(n³), which is significantly higher than the O(n²) complexity of both the DP and CADP algorithms. The cubic time complexity results from how the method examines topological relationships for every point in the trajectory, including intensive operations such as testing pairwise relations and maintaining contextual and topological consistency along the trajectory. This O(n³) complexity becomes prohibitive for large trajectory sizes, with a far higher runtime than O(n²) algorithms. For example, while CADP takes roughly 3355 milliseconds to process 100,000 points, the CPLS method requires upwards of 127,315 milliseconds on the same dataset. This huge increase further emphasizes how computationally inefficient CPLS is for large-scale applications.
In contrast, the CADP algorithm shows that contextual parameters can be incorporated without a significant increase in computational complexity; it keeps the O(n²) complexity of the same class as DP. Thus, CADP balances the two goals well and is more suitable for large-scale trajectory analysis. This difference, both in theoretical time complexity and in empirical performance, shows that CADP enhances scalability for large-scale and real-world application scenarios with finite computational resources compared to other methods.
The results of the runtime analysis give an idea of the practical impact of this difference in complexity. For instance, whereas the DP algorithm takes 3190 milliseconds to process 100,000 points and the computationally most expensive variant of CADP (Spatial + Wave Direction) takes 3355 milliseconds, the two-stage CPLS method requires 127,315 milliseconds for the same number of points. Such a huge contrast reveals the inadequacy of CPLS for large-scale applications, where its runtime is almost 38 times higher than that of CADP.
Although the integration of contextual dimensions introduces additional overhead, CADP does so without increasing its theoretical time complexity beyond O(n²). For example, the CADP (Spatial + Water Depth) variant processes 100,000 points in 3290 milliseconds, while the CADP (Spatial + Wave Height) and CADP (Spatial + Wave Direction) variants take 3340 and 3355 milliseconds, respectively. This modest increase over DP, at 3190 milliseconds, demonstrates how efficiently CADP balances context awareness and scalability. By comparison, the two-stage CPLS method, though effective in retaining topological relationships and spatial consistency, has a prohibitive computational expense for large datasets. These results highlight the advantage of CADP, which combines the computational efficiency of O(n²) algorithms with multiple contextual dimensions, making it more practical for large-scale trajectory compression and analysis tasks.
3.5. Limitations
While Equation (8) copes very well with nominal and non-nominal contexts, its applicability depends directly on the quality and consistency of context data. Inconsistent quality or sampling rates may affect the robustness of the error measure. Moreover, the integration of contextual domains has higher computational costs than a standard DP algorithm. Further optimizations, such as parallel processing or adaptive thresholds, might be investigated to overcome drawbacks and enhance scalability.
Another limitation of this study is data availability, as evaluating the proposed CADP algorithm relies on AIS data and environmental context. Although AIS data are widely used for maritime trajectory analysis because of their rich spatial and contextual information, the applicability of CADP to semantic contexts and other types of trajectory datasets, such as those from pedestrians, vehicles, or aircraft, remains unexplored. This is mainly due to the lack of a semantic dataset for the analyzed trajectories. While trajectory datasets for land- or air-based movements are more often openly available, the contextual data required, such as traffic flow, road maps, or atmospheric conditions, are either unavailable or restricted to a few public repositories. The current study, therefore, focuses on demonstrating the efficiency of CADP in maritime applications using AIS data.
While DTW captures temporal distortions effectively and gives a robust measure for spatial alignment, it does not explicitly assess semantic similarity, reflecting the retention of meaningful contextual transitions or features. As a result, the study may not fully explore the impact of context integration on maintaining semantic fidelity in compressed trajectories.
4. Conclusions
This study introduced a novel Context-Aware Douglas–Peucker (CADP) trajectory compression method that integrates both spatial dimensions and environmental and semantic contextual factors, aiming to enhance the quality and utility of compressed trajectories. Traditional trajectory compression methods, such as the Douglas–Peucker (DP) algorithm, focus on spatial characteristics, often neglecting the influence of critical contextual factors like weather or ocean conditions, which can lead to the loss of valuable information during compression. In contrast, the CADP method addresses these limitations by incorporating external environmental data, including water depth, wave height, and wave direction, into the compression process.
The results demonstrate that the CADP method performs significantly better in maintaining trajectory similarity and improving trajectory prediction accuracy. Including contextual information, such as water depth, led to a Root Mean Square Error (RMSE) of 0.004195 in an LSTM-based trajectory prediction, outperforming uncompressed, DP-compressed, and two-stage-CPLS-compressed datasets. This improvement underscores the importance of integrating context into compression algorithms to preserve moving objects’ behavior and movement patterns, especially in complex environments like maritime navigation.
Moreover, the CADP method maintains traditional DP compression’s computational efficiency with a worst-case O(n²) complexity. This ensures that the method is scalable and effective for large datasets, such as those generated by Automatic Identification Systems (AIS) in maritime traffic monitoring. By preserving critical trajectory points influenced by contextual factors, CADP enhances the dataset’s representativeness without significantly increasing its volume, making it suitable for further analysis in applications such as collision avoidance, vessel route planning, and anomaly detection systems.
Furthermore, experiments show that CADP compresses trajectories while effectively embedding contextual domains like water depth, wave height, and wave direction with negligible effects on scalability and computational efficiency. The algorithm is a practical and dependable extension of the DP algorithm; hence, it can be very suitable for many context-aware applications involving trajectory analytics, such as maritime navigation, environmental monitoring, and road transportation systems.
This study shows how context-aware trajectory compression bridges the gap between spatial accuracy and contextual domains. CADP, therefore, represents a practical, scalable, and high-impact contribution to trajectory analysis: it addresses very real challenges in maritime navigation while opening very promising avenues for extensions into other industries.
Future research should be directed at embedding semantic context, like the activity of moving objects or operational purposes, into trajectory compression and analysis. This dimension extends the CADP algorithm to richer scenarios in which semantic information plays an important role in shaping the behavior of trajectories. It will also enable an investigation into how semantic context influences compression efficiency, semantic similarity, and prediction accuracy, further establishing the adaptability and robustness of the CADP framework.
Moreover, extending the CADP algorithm to more classes of trajectories, such as those generated by the movement of vehicles or aircraft, will allow for greater generalization. To that end, domain-specific contextual datasets will have to be sourced for roads; this may involve accessing a road network, traffic flow information, atmospheric data, or synthetic data that can be created to test various domains. Further support for such an extension can be achieved by collaborating with data providers or generating synthetic contextual datasets to enable a better demonstration of the adaptability of CADP to diverse scenarios.
Future work should explore the effects of spatial autocorrelation and uncertainty in trajectory compression, as these factors can influence both the data integration process and the compression quality. Addressing these challenges will improve the robustness and applicability of context-aware trajectory compression in a broader range of geospatial and transportation systems.
Also, future research should be channeled into developing trade-off curves that can consider how best to balance real-time and offline trajectory processing in shipping. These curves can consider the size of a ship, type of freight, environmental dynamics, and respective costs related to processing or memory. These quantified trade-off curves will give actionable insights into optimizing trajectory generation and compression strategies for the concrete needs of industry stakeholders.
Maritime trajectory generation and analysis occupy a special position between the robotics and spacecraft domains. While maritime navigation, like the robotics domain [50,51,52], is concerned with dense environments and safety-related constraints, it also requires energy efficiency in route planning, similar to the spacecraft domain [53,54,55]. On the other hand, unlike spacecraft, ships need to consider external forces such as waves, currents, and tides [56]. Therefore, tailored approaches like CADP are needed: CADP embeds environmental and contextual factors and balances spatial accuracy, fuel efficiency, and safety.
Therefore, further research is required to translate trajectory planning methods from robotics and spacecraft to the maritime domain. Strategies for dynamic obstacle avoidance may be adopted from robotics or energy optimization techniques from space. Further, trade-off curves can be drawn on the balance between offline and real-time trajectory processing with respect to ship size, freight type, and environmental conditions. This could provide a cross-domain contribution to improving the robustness and applicability of trajectory planning.