Article

Deep Learning-Enhanced Sampling-Based Path Planning for LTL Mission Specifications

Department of Information and Telecommunication Engineering, Incheon National University, Incheon 22012, Republic of Korea
* Author to whom correspondence should be addressed.
Sensors 2024, 24(10), 2998; https://doi.org/10.3390/s24102998
Submission received: 20 March 2024 / Revised: 2 May 2024 / Accepted: 3 May 2024 / Published: 9 May 2024
(This article belongs to the Topic Advances in Mobile Robotics Navigation, 2nd Volume)

Abstract

This paper introduces a novel path planning algorithm designed to generate low-cost trajectories that fulfill mission requirements expressed in Linear Temporal Logic (LTL). The proposed algorithm is particularly effective in environments where cost functions encompass the entire configuration space. A core contribution of this paper is a refined approach to sampling-based path planning that aligns the search with the specified mission objectives. This enhancement is achieved through a multi-layered framework, enabling a simplified discrete abstraction without relying on mesh decomposition. This abstraction is especially beneficial in complex or high-dimensional environments where mesh decomposition is challenging. The discrete abstraction effectively guides the sampling process, influencing the selection of vertices for extension and target points for steering in each iteration. To further improve efficiency, the algorithm incorporates a deep learning-based extension, utilizing training data to accurately model the optimal trajectory distribution between two points. The effectiveness of the proposed method is demonstrated through simulated tests, which highlight its ability to identify low-cost trajectories that meet specific mission criteria. Comparative analyses further confirm the superiority of the proposed method over existing methods.

1. Introduction

Path planning is a critical element in robotics, evolving from its initial focus on straightforward two-dimensional tasks to addressing more complex challenges involving advanced robotic systems. This evolution is well-documented in the research literature, including studies such as [1,2,3,4,5,6,7,8,9], showcasing the field’s shift towards intricate scenarios like those involving robot manipulators.
Today, path planning goes beyond simple obstacle navigation to involve robots in multifaceted missions, which might include tasks like area coverage or sequential target visits. These complex missions require sophisticated algorithms that not only plan paths but also coordinate multi-robot operations, as discussed in [10,11,12,13,14].
In the realm of mission-driven path planning, the process often begins with defining the mission type, typically focusing on coverage. Comprehensive algorithms are then developed to integrate these mission specifications effectively with path planning strategies. A significant challenge is converting missions, often described in natural language, into forms that are compatible with path planning algorithms. This necessity has led to the increased use of formal methods like Linear Temporal Logic (LTL), Computation Tree Logic (CTL), and μ-calculus to precisely define robot mission requirements [15,16,17]. LTL, in particular, has become prominent for its ability to clearly articulate robotic objectives [18,19,20,21]. For instance, an LTL formula like ϕ = ◊(A) ∧ ◊(B) explicitly denotes tasks such as delivering items to region A and collecting items from region B. This specification showcases the utility of LTL in formulating simple yet explicit mission objectives. Furthermore, the flexibility of LTL allows for the expression of more complex tasks by combining various operators, thereby accommodating a broader range of mission requirements and enhancing the adaptability of robotic systems to diverse operational contexts.
Moreover, path planning is increasingly tasked with not just finding feasible paths but also optimizing them in terms of various cost considerations. For example, optimizing a robot’s trajectory to maintain strong communication in varying signal environments illustrates the importance of cost-effective routing, where the optimal path minimizes time in low-signal areas. This scenario highlights the ongoing challenge of balancing cost efficiency with complex mission fulfillment, especially when integrating sophisticated mission specifications like LTL formulas into dynamic environments.
Computational efficiency is another critical aspect of path planning. Sampling-based motion planning methods, which are prevalent in practical applications, often require extensive sampling to find viable solutions. These methods, marked by thorough exploration of the state space, demand numerous samples, particularly in low-cost path search algorithms like RRT*. The complexity of state spaces and collision constraints adds significant computational burden to the sampling process. To enhance planning efficiency, machine-learning techniques have been incorporated into sampling-based motion planning algorithms [22,23,24]. These approaches utilize data-driven learning to reduce redundant sampling, thereby facilitating the more efficient discovery of solution paths.
This paper addresses the challenge of developing cost-aware path planning strategies within the framework of dynamic constraints and Linear Temporal Logic (LTL) specifications. Our primary contribution lies in enhancing the efficiency of the search process to identify low-cost trajectories that not only fulfill LTL mission specifications but also optimize quantitative performance metrics across a continuous configuration space. The use of LTL allows us to enforce qualitative mission requirements, ensuring that all solutions adhere to specified behavioral constraints. Simultaneously, our optimization techniques focus on minimizing tangible costs such as travel distance, time, or energy, thereby integrating qualitative mission goals with quantitative optimization objectives. This dual approach ensures that our path planning solutions are both compliant with mission-specific requirements and optimized for operational efficiency.
We employ syntactically co-safe LTL formulas to define mission objectives, calculating trajectory costs as an aggregate of expenses incurred throughout the entire path. To address this complex challenge, we utilize a sampling-based path planning methodology that integrates specified LTL conditions into the search tree expansion, consistently aligning the process with the LTL-driven mission.
Building upon the foundation laid in our previous research [25], we have further refined the search tree expansion method, as illustrated in Figure 1. Maintaining the dual-layered structure from [25], our strategy employs high-level and low-level layers to ascertain solution trajectories that meet both LTL specifications and cost-optimization criteria. The high-level layer ensures trajectory compliance with LTL specifications by directing the search in the low-level layer, which is optimized to discover low-cost pathways through a rewiring procedure inspired by RRT* [26]. This setup facilitates the continuous refinement of tree information for ongoing adjustments.
The innovation in our methodology manifests in two major improvements over [25]: the adoption of a minimal abstraction approach in the high-level layer, eliminating the need for mesh decomposition, and the integration of a meticulously fine-tuned SEQ2SEQ-CVAE deep neural network. This network significantly enhances the efficiency of identifying low-cost trajectories, marking a substantial advancement in robotic path planning.
Furthermore, to address the necessity of treating the path planning problem as an integrated task rather than separate subtasks, we consider the interdependencies and dynamics of the trajectory between points. By optimizing the path in an integrated manner, we ensure that constraints such as cost, time, and the sequence of operations are holistically considered. This often leads to solutions that are not only globally optimal but also more robust to changes in the environment. This approach is particularly advantageous in real-world applications where conditions may change unpredictably, requiring a flexible and adaptive path planning strategy. The integrated method facilitates real-time adjustments and optimizations, critical in dynamic settings where separate optimization of subtasks might fail to account for evolving operational contexts.
Figure 2 illustrates a solution trajectory generated by our proposed method in a simulated test environment. This trajectory is designed to comply with the requirements of the Linear Temporal Logic (LTL) formula ϕ = ◊(a ∧ ◊(b)), which dictates a sequential visit to specified regions a and b. The method dynamically expands the search tree to discover an optimal path that satisfies this LTL constraint. The accompanying costmap visually differentiates between low-cost areas (depicted in blue) and high-cost zones (shown in yellow), guiding the trajectory to navigate through cost-efficient regions while meeting the mission’s objectives.
Extensive simulation testing confirms the effectiveness of the proposed method in efficiently discovering solutions and minimizing costs. The results demonstrate a significant improvement in performance compared to existing methodologies.

2. Related Work

Path planning in robotics and autonomous systems is a diverse field, addressing the complexities of navigating through environments with varying dynamics and specific mission requirements. This section reviews key contributions and emerging trends that directly inform our approach to low-cost, dynamic path planning under Linear Temporal Logic (LTL) specifications.
Finite deterministic systems: Research in this area often employs LTL to articulate constraints and objectives, focusing on cost optimization within given mission parameters [27,28]. These works have laid foundational approaches for integrating cost functions into path planning. However, their application to continuous path planning is limited by the reliance on discrete models, which struggle to capture the full dynamics of robotic movement. Our study builds upon these foundational concepts to address the continuous nature of robotic path environments.
Sampling-based motion planning: Methods such as Rapidly-exploring Random Trees (RRT) and its variants [29,30,31] have significantly advanced the handling of complex dynamics within path planning. The scalability issues of constructing product automata in these methods [32] highlight a critical bottleneck that we aim to address by integrating more efficient computational techniques and advanced logic formulations in our proposed method.
Optimization methods: Techniques transforming LTL specifications into mixed-integer constraints [33,34] have structured the search for optimal paths under strict logical constraints. Our research extends these methodologies by enhancing the computational efficiency and scalability of path planning algorithms, crucial for dealing with complex environments laden with obstacles.
Multi-layered frameworks: Multi-layered frameworks [25,35,36,37] represent a significant advancement in path planning by integrating discrete abstraction with sampling-based methods. These frameworks facilitate the identification of feasible trajectories that satisfy complex logic constraints, such as those imposed by Linear Temporal Logic (LTL). Our proposed method builds upon these frameworks to enhance the efficiency and effectiveness of path planning in dynamic environments, where adaptability and compliance with LTL specifications are crucial.
Learning from demonstration (LfD) approaches: Learning from demonstration has emerged as a powerful approach to path planning, particularly within the context of autonomous systems [38,39,40,41]. By integrating machine learning techniques with rule-based systems, LfD enables the derivation of robust and interpretable policies that align with predefined safety and operational specifications. Our approach leverages these advancements to refine the path planning process, ensuring that it not only meets technical and safety criteria but also adapts to real-world complexities by learning from actual operational data.

3. Preliminaries

This section introduces the basic concepts and system models essential for understanding our path planning approach, focusing on the dynamics and control mechanisms relevant to autonomous systems.

3.1. System Model

To define the system model, the following notations are introduced:
  • X ⊂ ℝ^n: The state space of the system.
  • X_obs ⊂ ℝ^n: The obstacle space.
  • X_free = X \ X_obs: The free space.
  • U ⊂ ℝ^m: The control space.
  • W ⊂ ℝ^{n_w}: The workspace.
  • h: X → W: The mapping function from the state space to the workspace.
The dynamic model of the system is described by the following equation:
ẋ_t = f(x_t, u_t),
where x_t ∈ X_free specifically represents a state within the navigable subset of the overall state space X that is free from obstacles. Here, u_t ∈ U denotes the control input, and f is a smooth, continuously differentiable function mapping the current state and control input to the rate of change of the state.
The trajectory x of the system, starting from the initial state x_0 under a given control signal u over the time interval [0, T], is denoted by x(x_0, u). This trajectory represents a sequence of states as a function of time, capturing the system’s evolution from x_0 under the influence of u.
To facilitate computational handling and simulation, the trajectory is often discretized. The discretized version of this trajectory, denoted by x_Δt(x_0, u), is defined as
x_Δt(x_0, u) = { x(x_0, u, iΔt) }_{i=0}^{i_f},
where Δt ∈ ℝ⁺ is the time step and i_f ∈ ℕ is the final index of the discretization. Each element in this sequence, x(x_0, u, iΔt), corresponds to the state of the system at the discrete time instant iΔt, providing a vector representation of the state at each sampled point.
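As a concrete illustration, the discretization above can be realized with a simple forward-Euler rollout. The sketch below is illustrative only; the dynamics f, time step, and control sequence are placeholders rather than the models used in our experiments:

```python
import numpy as np

def rollout(f, x0, u_seq, dt):
    """Forward-Euler discretization of x_dot = f(x, u): returns the sequence
    {x(x0, u, i*dt)} for i = 0..i_f, with i_f = len(u_seq)."""
    xs = [np.asarray(x0, dtype=float)]
    for u in u_seq:
        xs.append(xs[-1] + dt * f(xs[-1], u))  # x_{i+1} = x_i + dt * f(x_i, u_i)
    return np.stack(xs)

# Example with a single integrator x_dot = u in R^2.
traj = rollout(lambda x, u: u, x0=[0.0, 0.0],
               u_seq=[np.array([1.0, 0.5])] * 10, dt=0.1)
```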

3.2. Linear Temporal Logic (LTL)

Linear temporal logic (LTL) is a logical formalism suited for specifying linear time characteristics [30]. LTL formulas consist of atomic propositions (APs), Boolean operators, and temporal operators. APs are statements that can be either true or false. The LTL operators include ◯ (next), U (until), □ (always), ◊ (eventually), and ⇒ (implication). LTL formulas follow specific grammatical rules [42].
Π = {π_0, π_1, …, π_N} stands for a set of atomic propositions. A trace σ is a sequence of atomic propositions, and LTL semantics are defined over infinite traces. Given a trace σ, the notation σ ⊨ ϕ signifies that σ satisfies the formula ϕ.
This study concentrates on path planning within a finite time horizon, with a specific focus on syntactically co-safe LTL formulas (sc-LTL) [43]. Sc-LTL represents a subset of Linear Temporal Logic (LTL) formulas, characterized by certain constraints. An LTL formula ϕ is considered co-safe if any infinite trace that satisfies ϕ also has a finite prefix fulfilling ϕ . In this context, all temporal logic formulas discussed and utilized adhere strictly to the sc-LTL criterion. This approach ensures that the path planning solutions developed are not only compliant with the temporal logic requirements but also suitable for finite time horizon scenarios, a crucial aspect for practical applications in robotics and autonomous systems.

3.2.1. Automaton Representation

In the realm of automaton representation, the study engages with nondeterministic finite automata (NFA) constructed from a set of atomic propositions Π and syntactically co-safe LTL formulas [44]. For illustration, considering a syntactically co-safe LTL formula such as ϕ = ◊(a ∧ ◊(b ∧ ◊(c))), an NFA can be constructed, as depicted in Figure 3. The NFA, while instrumental, can be transformed into a deterministic finite automaton (DFA) for computational efficiency.
A DFA is formalized as A_ϕ = (Q, Σ, δ, q_init, Q_acc), where the components are defined as follows:
  • Q: A finite set of states.
  • Σ = 2^Π: A finite alphabet.
  • δ: Q × Σ → Q: The transition function.
  • q_init ∈ Q: The initial state.
  • Q_acc ⊆ Q: The set of accepting states.
In this context, an infinite trace σ of a DFA is deemed accepting if its prefix eventually intersects one of the accepting states of the DFA (i.e., σ_i ∈ Q_acc for some i). Therefore, a trajectory x_Δt(x_0, u) is considered to satisfy the LTL formula ϕ if its corresponding trace is accepted by the DFA A_ϕ. Figure 3 illustrates the NFA representing the syntactically co-safe LTL formula ϕ, showcasing its structure consisting of four states and the transition relations for the input alphabet.

3.2.2. LTL Semantics over Trajectories

In the workspace W, define regions of interest as P = {P_1, …, P_n}. Correspondingly, let each atomic proposition π_j ∈ Π be uniquely associated with a region of interest P_j. We employ a labeling function L: W → 2^Π to map points within the workspace to the atomic propositions that hold true at those locations. For any π_i ∈ Π, the negated proposition ¬π_i is valid at any workspace point w where π_i is not in the label set L(w). Notably, π_0 holds true universally, except within designated regions of interest and obstructed spaces.
Given a control input u over a discrete time sequence, the trajectory x_Δt(x_0, u) = x_0, x_1, …, x_m, starting from x_0 and progressing at intervals Δt, can be represented as a sequence of labels:
trace(x_Δt(x_0, u)) = L(h(x_0)), L(h(x_1)), …, L(h(x_m)).
The corresponding automaton states for a trajectory trace, trace(x_Δt(x_0, u)) = τ_0, τ_1, …, τ_m, can be constructed by applying the transition function δ as follows:
q_k = δ(q_init, τ_0) if k = 0,  and  q_k = δ(q_{k−1}, τ_k) if k ≥ 1.
A trajectory is considered compliant with the LTL formula ϕ, denoted by x_Δt(x_0, u) ⊨_Δt ϕ, if it leads to an automaton state within the accepting set Q_acc.
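For illustration, this satisfaction check reduces to a few lines of Python; the transition function delta, the labeling L, and the mapping h are assumed to be supplied by the application:

```python
def automaton_states(delta, q_init, trace):
    """Propagate a trace tau_0, tau_1, ... through the DFA: q_k = delta(q_{k-1}, tau_k)."""
    states, q = [], q_init
    for tau in trace:
        q = delta(q, tau)
        states.append(q)
    return states

def satisfies(delta, q_init, Q_acc, traj, L, h):
    """A finite trajectory satisfies phi if its label trace reaches an accepting state."""
    trace = [L(h(x)) for x in traj]          # trace(x_dt(x0, u))
    return any(q in Q_acc for q in automaton_states(delta, q_init, trace))
```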

4. Proposed Method

The accumulated cost J(x_0, u) represents the time-averaged integral of the cost function c along a trajectory and is defined as follows:
J(x_0, u) = (1/T) ∫_0^T c(x(x_0, u, t)) dt,
where c: X → ℝ⁺ is a continuous and bounded cost function, u is a control signal from t = 0 to t = T, and x_0 is the initial state. The mission is defined using a syntactically co-safe LTL formula, where each atomic proposition corresponds to a region of interest.
This study introduces a refined trajectory planning method designed to minimize costs and fulfill Linear Temporal Logic (LTL) mission specifications. This method emphasizes the integration of dynamic models and cost functions, a departure from traditional path planning approaches that often overlook such comprehensive considerations.
By adopting and enhancing the Rapidly exploring Random Tree (RRT) algorithm, this method improves the selection and expansion of search tree vertices. This not only optimizes the efficiency of path planning but also ensures cost-effectiveness. The innovation extends established two-layered path planning strategies, drawing from the strengths of models cited in the literature [25,35,45,46], and enhances them to suit modern, complex scenarios.
A critical advancement of our proposed method is its move away from traditional mesh decomposition techniques. While effective in less complex environments, mesh decomposition can become a bottleneck in more complicated settings. Our method simplifies the planning process with a discrete abstraction approach, eliminating the need for mesh decomposition and consequently providing greater adaptability to varied and challenging environments.
An integral advancement of our method is the innovative application of deep learning within the path planning process. We utilize a Conditional Variational Autoencoder (CVAE) [47], augmented by a SEQ2SEQ structure [48], which together form a powerful data-driven approach to identify cost-efficient path segments. This combination enables a strategic and biased sampling strategy that significantly streamlines the path discovery process. The incorporation of this deep learning mechanism, which is inspired by and expands upon previous research [22,25], adheres to the core tenets of sampling-based planning. It concurrently introduces marked enhancements in efficiency by learning complex patterns and dependencies in path cost data, facilitating the generation of solutions that are both cost-effective and aligned with stringent mission specifications.
The advantages of the proposed method include the following:
  • Enhanced efficiency: This method’s focus on optimizing vertex selection and extension in the search tree leads to more efficient trajectory planning.
  • Flexibility in complex environments: By eliminating the need for mesh decomposition, this method is more adaptable to intricate and high-dimensional environments.
  • Improved cost-effectiveness: The deep learning component allows for a more targeted search for low-cost paths, aligning closely with mission requirements.
  • Scalability: The use of a deep learning framework offers scalability and adaptability to a range of different path planning scenarios and constraints.
Overall, this proposed method marks a substantial progression in the field of path planning. It offers a robust, efficient, and adaptable framework for navigating complex environments, meeting detailed mission specifications, and optimizing path costs, thereby setting a new benchmark for future trajectory planning endeavors.

4.1. Minimal Discrete Abstraction

Discrete abstraction serves to construct a simplified representation of the problem, focusing on environmental geometry, such as obstacles. Its primary function is to facilitate the search tree’s compliance with the specified LTL. The abstraction utilizes a straightforward graph, D = (R_d, E_d), which delineates the geometric structure of both the regions of interest P and the non-obstacle zones that are not regions of interest, denoted W. Vertices are defined as R_d = W ∪ P, while edges are represented by E_d = {(r_i, r_j) : r_i, r_j ∈ R_d are adjacent}.
The proposed method takes the cross product of the graph D with the DFA of the syntactically co-safe LTL formula ϕ, resulting in a minimal discrete abstraction. This abstraction differs only marginally in size from the DFA, primarily due to the deliberate avoidance of mesh decomposition. Following the establishment of this abstraction, discrete planning is employed to generate discrete plans. These plans comprise a sequence of discrete states, each of which is represented as a pair ⟨q, r⟩, with q ∈ A_ϕ.Q and r ∈ R_d. This approach provides a streamlined and efficient means to navigate the planning space while adhering to the designated mission criteria.
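A minimal sketch of this product construction is given below. Because the discrete plans are later extracted with uniform edge costs (Section 4.3), a breadth-first search over the on-the-fly product is equivalent to Dijkstra’s algorithm here; the adjacency dictionary Ed, the region labeling function, and all names are our assumptions:

```python
from collections import deque

def discrete_plan(Ed, delta, label, q_init, r_init, Q_acc):
    """BFS over the product of the region graph D = (Rd, Ed) and the DFA.
    Returns a shortest sequence of pairs <q, r> reaching an accepting state."""
    start = (q_init, r_init)
    parent = {start: None}
    queue = deque([start])
    while queue:
        q, r = queue.popleft()
        if q in Q_acc:                         # accepting product state found
            plan, node = [], (q, r)
            while node is not None:
                plan.append(node)
                node = parent[node]
            return plan[::-1]
        for r2 in Ed[r]:                       # adjacent regions of r
            q2 = delta(q, label(r2))           # DFA step on the entered region's label
            if (q2, r2) not in parent:
                parent[(q2, r2)] = (q, r)
                queue.append((q2, r2))
    return None                                # no plan within this abstraction
```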

4.2. Search Tree

The central data structure of the proposed algorithm is a search tree, denoted as T, which comprises vertices T.V and edges T.E. The root of this tree, v_init, represents the initial state of the system. Each vertex within the tree contains crucial planning information, encapsulating various aspects of the robot’s status. These include the robot’s state (expressed as v.x), the discrete region it occupies (v.r), the current state of the automaton (v.q), and the accumulated trajectory cost from the initial state v_init to the current vertex (v.c).
Edges within the tree provide insights into the control inputs and time intervals required to transition between two vertices, v_i and v_j. The algorithm manages vertices with identical high-level states by grouping them into sets based on their automaton state and discrete region. This grouping is mathematically represented as follows:
Γ_⟨q,r⟩ = { v ∈ T.V | v.q = q, v.r = r }.
Such organization within the search tree allows for an efficient and structured approach to planning, ensuring that the algorithm systematically navigates through the planning space while conforming to the specified LTL mission specifications and dynamic constraints.
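The vertex bookkeeping described above maps naturally onto a small data structure. The following sketch (class and field names are ours) keeps the partition Γ_⟨q,r⟩ updated incrementally as vertices are added:

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

@dataclass
class Vertex:
    x: tuple                       # robot state v.x
    r: str                         # discrete region v.r
    q: str                         # automaton state v.q
    c: float = 0.0                 # accumulated cost v.c from the root
    parent: Optional["Vertex"] = None

class SearchTree:
    def __init__(self, root: Vertex):
        self.V, self.E = [root], []
        self.groups = defaultdict(list)              # Gamma_{<q,r>}
        self.groups[(root.q, root.r)].append(root)

    def add(self, v: Vertex, parent: Vertex):
        v.parent = parent
        self.V.append(v)
        self.E.append((parent, v))
        self.groups[(v.q, v.r)].append(v)            # keep the partition current
```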

4.3. Algorithm

The algorithm is described in Algorithm 1. It takes the following as inputs:
  • DFA of a syntactically co-safe LTL formula, denoted as A_ϕ.
  • Workspace information, D = (R_d, E_d).
  • Robot’s initial state, x_init.
  • Initial automaton state, q_init.
The search tree, symbolized as T, begins with the initialization of x_init and q_init. This approach stands out from traditional two-layered path planning methods by determining discrete plans prior to the main iteration rather than during it. The advantage of this early search is derived from employing a minimal discrete abstraction.
Discrete plans, represented as Σ_P and compliant with the LTL, are generated from the initial discrete state. These plans are obtained using Dijkstra’s algorithm on the product space A_ϕ.Q × D, with edge costs uniformly set to 1. This uniform cost assignment ensures that while the discrete plans steer the search tree in accordance with the LTL specifications, they do not inherently prioritize low-cost trajectories.
As the iterative process unfolds, the search tree T continues to expand. Each iteration involves the algorithm selecting a starting discrete state, ⟨q_s, r_s⟩, based on the current configuration of the search tree. Subsequently, a target discrete state is identified from the set of discrete plans Σ_P and the selected starting state. The expansion of the tree T then proceeds towards the targeted discrete region r_t. Upon the completion of these iterations, the algorithm identifies the optimal trajectory that fulfills the LTL formula ϕ as the final solution. Further sections of the paper delve into more detailed descriptions of each of these steps, providing a comprehensive understanding of the process and its underlying mechanisms.
Algorithm 1 Proposed method
1:  T.V ← {⟨x_init, q_init⟩}
2:  T.E ← ∅
3:  Σ_P ← DiscretePlanning(A_ϕ, D, q_init, r_init)
4:  while time() < t_max do
5:      ⟨q_s, r_s⟩ ← GetInitialState(T)
6:      ⟨q_t, r_t⟩ ← GetTargetState(T, Σ_P, ⟨q_s, r_s⟩)
7:      T ← ExtendTree(A_ϕ, T, Γ_⟨q_s,r_s⟩, r_t)
8:  end while
9:  if ∃ v_i ∈ T.V s.t. v_i.q ∈ A_ϕ.Q_acc then
10:     return Ξ ← SelectMinTraj(T)
11: else
12:     return null
13: end if

4.3.1. Selection of an Initial State

The function GetInitialState chooses an initial discrete state ⟨q_s, r_s⟩ for expansion. This choice is influenced by the exploration data from the search tree, assigning weights to each discrete state. The weight of a discrete state ⟨q, r⟩ is formulated as follows:
w(⟨q, r⟩) ∝ Area(r) / (1 + |Γ_⟨q,r⟩|)^α,
where Area(r) is the area of region r, and |Γ_⟨q,r⟩| signifies the number of vertices linked to ⟨q, r⟩. The positive real number α adjusts the effect of the vertex count. Building upon prior research [25,35,45,46], this approach has been modified for the main problem of this paper.
States are chosen probabilistically from the set of discrete states that contain the search tree’s vertices. The selection probability is directly proportional to w(⟨q, r⟩), favoring larger discrete regions and states explored less previously.
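A minimal sketch of this weighted selection, assuming the partition Γ is stored as a dictionary keyed by ⟨q, r⟩ and region areas are precomputed (the value of α here is a placeholder, not the paper’s tuned setting):

```python
import random

def get_initial_state(groups, area, alpha=2.0):
    """Sample <q, r> with probability proportional to
    w(<q, r>) = Area(r) / (1 + |Gamma_{<q,r>}|)**alpha."""
    states = list(groups)
    weights = [area[r] / (1 + len(groups[(q, r)])) ** alpha for q, r in states]
    return random.choices(states, weights=weights, k=1)[0]
```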

4.3.2. Selection of a Target State

The GetTargetState function determines the target discrete state for pursuit, aiming to steer the search tree according to the given LTL specification or to inspect other regions. This study employs two methods for target state selection:
  • Transitioning to the next automaton state: Initially, a discrete plan σ_P from the plan set Σ_P that includes the starting state ⟨q_s, r_s⟩ is selected. This plan helps identify the target state ⟨q_t, r_t⟩ to proceed to the next automaton state. If σ_P is defined as
    σ_P = ⟨q_init, r_init⟩, ⟨q_1, r_1⟩, …, ⟨q_M, r_M⟩,
    and the i-th element in σ_P is the starting state ⟨q_i, r_i⟩ = ⟨q_s, r_s⟩, then σ_{P,i} is the suffix from the (i+1)-th index:
    σ_{P,i} = ⟨q_{i+1}, r_{i+1}⟩, …, ⟨q_M, r_M⟩.
    From this suffix σ_{P,i}, we select the target state ⟨q_t, r_t⟩. The criteria for this selection are as follows: r_t should belong to the regions of interest (i.e., r_t ∈ {P_1, …, P_n}), and q_t should be the first automaton state that differs from q_s.
  • Random state selection: The target state is chosen randomly from the set
    { ⟨q, r⟩ | q = q_s, r ∈ R_d },
    enabling the tree to explore within the automaton state q_s.
The first strategy prioritizes alignment with the LTL specification, whereas the second examines all possible states. A random draw determines the strategy employed: the first is chosen with probability p_D ∈ [0, 1), the second with probability 1 − p_D.
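The two strategies can be combined as in the following sketch, where the plan set is stored as lists of ⟨q, r⟩ pairs and the helper names are ours:

```python
import random

def get_target_state(plans, q_s, r_s, Rd, p_D):
    """With probability p_D, follow a discrete plan toward the next automaton
    state; otherwise pick a random region within the current automaton state."""
    if random.random() < p_D:
        for plan in plans:                    # find a plan containing <q_s, r_s>
            if (q_s, r_s) in plan:
                i = plan.index((q_s, r_s))
                for q, r in plan[i + 1:]:     # scan the suffix sigma_{P,i}
                    if q != q_s:              # first change of automaton state
                        return q, r
    return q_s, random.choice(list(Rd))       # explore within q_s
```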

4.3.3. Search Tree Expansion

The tree expansion algorithm, as outlined in Algorithm 2, builds upon the foundations established in previous studies [25]. The novel contribution of this work lies in the integration of a deep learning network. This network is specifically tailored to identify control sequences that effectively direct the path towards the target point, while simultaneously considering the cost configurations involved. This integration of deep learning into the tree expansion process represents a significant advancement, enhancing the algorithm’s ability to navigate complex environments and achieve cost-effective trajectories in alignment with the predefined LTL mission criteria.
The process begins by uniformly selecting a target point p_t from r_t (line 1). From the tree partition Γ_s, the initial vertex v_s is either randomly chosen from the vertices of T in Γ_s, with probability p_T ∈ [0, 1) (line 2), or the vertex nearest to p_t is designated as v_s, with probability 1 − p_T. A path is then sought, initiating from v_s and moving towards p_t.
Algorithm 2 ExtendTree(A_ϕ, T, Γ_s, r_t)
1:  p_t ← SelectTargetPoint(r_t)
2:  v_s ← SelectInitialVertex(T, Γ_s, p_t)
3:  if ‖v_s.x − p_t‖ > δ_d then
4:      ζ_seg ← DL_Extend(v_s.x, p_t)
5:  else
6:      ζ_seg ← Steer(v_s.x, p_t)
7:  end if
8:  if CollisionFree(ζ_seg) then
9:      T ← UpdateTree(A_ϕ, T, v_s, ζ_seg)
10: end if
11: return T
If the distance between v_s.x and p_t surpasses a threshold δ_d, the deep-learning-based DL_Extend function is employed to discover a cost-efficient path from v_s.x to p_t (line 4). Further details of DL_Extend are elucidated in the subsequent section. Alternatively, the familiar Steer function from sampling-based motion planning algorithms [26] is utilized. Once the generated path segment ζ_seg, initiating from v_s.x and targeting p_t, is verified as valid, it is incorporated into the search tree T (line 9).
Algorithm 3 presents the UpdateTree function, which plays a crucial role in the tree expansion algorithm. In this context, Q_feas represents the set of automaton states that are capable of reaching the accepting automaton states. A key aspect of this function is the integration of the rewiring procedure from the RRT* algorithm. This procedure is applied to refine the cost associated with a vertex in T after evaluating neighboring vertices. Through this iterative process of refinement, the algorithm effectively converges towards a near-optimal solution. This approach enhances the algorithm’s efficiency and accuracy in determining cost-effective paths that align with the specified LTL mission requirements.
Each node in the path segment ζ_seg undergoes the UpdateVertex operation, refreshing the information of the newly appended vertex (line 3): the robot’s state x_i, the decomposed region Υ_d(h(x_i)), the automaton state A_ϕ.δ(v_parent.q, L(h(x_i))), and the trajectory cost from the initiating vertex to the current one, v_parent.c + cost(edge(v_parent, v_i)). The iteration is terminated if the automaton state of the new vertex is not part of the feasible automaton states Q_feas (line 4).
The function Near(x) yields a set of vertices, V_near, in T that reside within a radius ρ = γ(log n / n)^{1/d} centered at h(x), where n is the vertex count of T and d denotes the state dimension. A sufficiently large γ ensures the detection of a practicable number of rewiring candidates.
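A minimal sketch of this neighbor query, assuming Euclidean distance and leaving γ as an application-chosen constant:

```python
import math
import numpy as np

def near(tree, x, gamma, d):
    """Vertices of the tree within the shrinking radius
    rho = gamma * (log n / n)**(1/d) of the query point x."""
    n = len(tree.V)
    if n < 2:
        return []
    rho = gamma * (math.log(n) / n) ** (1.0 / d)
    return [v for v in tree.V if np.linalg.norm(np.subtract(v.x, x)) <= rho]
```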
The function FeasibilityCheck within the algorithm plays a pivotal role in verifying the legitimacy of connecting two vertices in the search tree. To establish a legitimate connection between any two vertices, the algorithm requires the fulfillment of two key conditions: first, the edge connecting the vertices must be free of any collisions, and second, there must be a match in the automaton state between the computed vertex and the existing vertex in the tree.
This rigorous process of verification and comparison incrementally refines the solution, ensuring its alignment with the provided Linear Temporal Logic (LTL) specification. This alignment is critical for guiding the algorithm towards an optimal solution. Specifically, lines 11–16 of the algorithm are dedicated to determining the minimum-cost parent for every newly added vertex. Following this, the rewiring procedure, an integral part of the RRT* algorithm, is executed in lines 19–26. The use of the FeasibilityCheck function within these steps is crucial for maintaining the algorithm’s integrity and effectiveness in finding a feasible and cost-efficient path that adheres to the specified mission criteria.
Algorithm 3 UpdateTree(A_ϕ, T, v_s, ζ_seg)
1:  v_parent ← v_s
2:  for x_i ∈ ζ_seg do
3:      v_i ← UpdateVertex(v_parent, x_i, A_ϕ)
4:      if v_i.q ∉ Q_feas then
5:          break
6:      end if
7:      T.V ← T.V ∪ {v_i}
8:      v_min ← v_parent
9:      c_min ← v_i.c + cost(edge(v_min, v_i))
10:     V_near ← Near(v_i.x)
11:     for each v_near ∈ V_near \ {v_parent} do
12:         c_near ← v_near.c + cost(edge(v_near, v_i))
13:         if FeasibilityCheck(A_ϕ, v_near, v_i) ∧ c_near < c_min then
14:             v_min ← v_near, c_min ← c_near
15:         end if
16:     end for
17:     T.E ← T.E ∪ {(v_min, v_i)}
18:     V_near ← Near(v_i.x)
19:     for each v_near ∈ V_near do
20:         c_new ← v_i.c + cost(edge(v_i, v_near))
21:         if FeasibilityCheck(A_ϕ, v_i, v_near) ∧ c_new < v_near.c then
22:             v_near.c ← c_new
23:             T.E ← T.E \ {edge(Parent(v_near), v_near)}
24:             T.E ← T.E ∪ {edge(v_i, v_near)}
25:         end if
26:     end for
27:     v_parent ← v_i
28: end for
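For illustration, the two conditions of FeasibilityCheck (lines 13 and 21) can be sketched as follows, with collision_free, the labeling L, and the mapping h assumed to be supplied by the application:

```python
def feasibility_check(delta, v_from, v_to, collision_free, L, h):
    """A connection v_from -> v_to is legitimate if (i) the edge is
    collision-free and (ii) stepping the DFA from v_from.q over the
    destination vertex's label reproduces v_to's stored automaton state."""
    if not collision_free(v_from.x, v_to.x):
        return False
    return delta(v_from.q, L(h(v_to.x))) == v_to.q
```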

4.3.4. Deep-Learning-Based Extension

The DL_Extend function is an integral component of the algorithm, specifically designed to identify low-cost path segments connecting two points. In the context of Algorithm 2 (line 4), DL_Extend(v_s.x, p_t) is employed to generate a path segment starting from point v_s.x and targeting the endpoint p_t. The primary aim of this function is to expedite the identification of low-cost trajectory options within the planning process. To achieve this, the function strategically guides the sampling towards areas with high potential for low-cost paths, utilizing a learned sample distribution to ascertain the most efficient route.
Prior to delving into the specifics of the deep learning network, it is essential to define the structure of the input and output data used by the network, as illustrated in Figure 4.
The process begins with the derivation of an optimal trajectory and corresponding control inputs that connect an initial state x_s to a final state x_e, taking into account both the costmap c and the obstacles obs. An input image, denoted as X, is then constructed from this information. This composite image is formed by concatenating four distinct elements: the costmap c, the obstacle map obs, and the starting (x_s) and ending (x_e) states. As a result, a four-channel image is produced, in which the starting and ending states are depicted as regions r_s and r_e, each positioned within its respective channel. This approach ensures a comprehensive representation of the planning environment, facilitating the deep learning network’s ability to accurately predict low-cost path segments.
In the training phase of the proposed deep learning network, the generated input image X is utilized as the input data, whereas the control U is considered the output data. It is crucial to emphasize that the strategy implemented for curating the training dataset is meticulously aligned with this defined input–output structure. This alignment ensures that the deep learning model is effectively trained on data that are representative of the actual scenarios it will encounter, thereby enhancing its ability to accurately predict control sequences for low-cost path segments in the context of the path planning algorithm.
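As an illustration, assuming the costmap, obstacle map, and the start/end region masks r_s and r_e are available as H×W arrays, the four-channel input X can be assembled as follows (the helper name is ours):

```python
import numpy as np

def build_input_image(costmap, obs_map, start_mask, end_mask):
    """Concatenate the four channels described above into X of shape (4, H, W):
    costmap c, obstacle map obs, and masks marking regions r_s and r_e."""
    channels = [costmap, obs_map, start_mask, end_mask]
    return np.stack([np.asarray(ch, dtype=np.float32) for ch in channels], axis=0)
```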
The architecture presented in this paper is based on the use of a Conditional Variational Autoencoder (CVAE) [47], which is an enhanced form of the standard variational autoencoder. The CVAE is particularly adept at conditional data generation, enabling the sampling of data from the model’s latent space. In the context of motion planning, these conditioning variables include input features relevant to the current planning scenario, with the output representing the trajectory or path segment.
The application of CVAE in sampling-based path planning algorithms has been previously investigated [25]. The method developed in this study further refines the use of the deep learning network, placing a special emphasis on the generation of control sequences. This concept of generating control sequences using deep learning has been previously explored in motion forecasting research [49,50]. The refinement aims to enhance the network’s capability to accurately predict control inputs for path segments, thereby improving the efficiency and effectiveness of the path planning process.
Figure 5 presents a detailed overview of the network architecture proposed in the study. The process begins with the extraction of the feature F from the input image X and the initial state x s using a Feature extractor network. This extracted feature F serves as the conditional variable in the Conditional Variational Autoencoder (CVAE) framework.
Consistent with the CVAE methodology, the architecture defines three parameterized densities: the recognition model q_ϕ(Z | F, U), the (conditional) prior model p_θ(Z | F), and the generation model p_θ(U | F, Z), where θ and ϕ are the parameter vectors. For clarity, the notations θ and ϕ are reserved for these parameter vectors throughout this section.
Both the encoder and decoder segments of the network incorporate Long Short-Term Memory units (LSTMs) [51], a choice aimed at capturing the temporal dynamics effectively. In the decoder model, particularly, the control outputs are structured as a Gaussian Mixture Model (GMM). This configuration allows for a more nuanced and accurate representation of the real data distribution. The input for each LSTM cell in the sequence is sampled from this GMM, facilitating the generation of control sequences that are reflective of the underlying data characteristics. This approach significantly enhances the network’s ability to predict control outputs that are conducive to efficient and effective path planning.
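To make the architecture concrete, the following PyTorch sketch wires an LSTM encoder and decoder around a latent code and a GMM output head. The hidden size (64), latent dimension (32), and feature dimension (64) follow the values reported in Section 5; the control dimension, mixture count, and exact layer wiring are our assumptions and do not reproduce the trained network:

```python
import torch
import torch.nn as nn

class Seq2SeqCVAE(nn.Module):
    """Sketch: LSTM encoder/decoder CVAE whose decoder emits GMM parameters."""
    def __init__(self, feat_dim=64, z_dim=32, hidden=64, u_dim=2, n_mix=5):
        super().__init__()
        self.enc = nn.LSTM(u_dim + feat_dim, hidden, batch_first=True)
        self.to_mu = nn.Linear(hidden, z_dim)          # mu_phi(F, U)
        self.to_logvar = nn.Linear(hidden, z_dim)      # log Sigma_phi(F, U)
        self.dec = nn.LSTM(z_dim + feat_dim + u_dim, hidden, batch_first=True)
        # Per-step GMM head: mixture logits + means + log-stds per component.
        self.gmm = nn.Linear(hidden, n_mix * (1 + 2 * u_dim))
        self.n_mix, self.u_dim = n_mix, u_dim

    def encode(self, F, U):                            # q_phi(Z | F, U)
        T = U.size(1)
        h, _ = self.enc(torch.cat([U, F.unsqueeze(1).expand(-1, T, -1)], -1))
        return self.to_mu(h[:, -1]), self.to_logvar(h[:, -1])

    def decode(self, F, z, U_prev):                    # p_theta(U | F, Z)
        T = U_prev.size(1)
        cond = torch.cat([z, F], -1).unsqueeze(1).expand(-1, T, -1)
        h, _ = self.dec(torch.cat([cond, U_prev], -1))
        return self.gmm(h)                             # raw GMM parameters per step
```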
In our study, we formalized the three parameterized density models in the subsequent manner:
  • The posterior density of the latent variables, given the features F and control sequence U, is modeled as a Gaussian distribution:
    q_ϕ(Z | F, U) = N(μ_ϕ(F, U), Σ_ϕ(F, U)),
    where the functions μ_ϕ(F, U) and Σ_ϕ(F, U) are outputs of an encoder neural network.
  • The prior density of the latent variables, given only the features F, is set to a standard Gaussian:
    p_θ(Z | F) = N(0, I).
  • The conditional probability of the control sequence, given the features F and latent variables Z, is expressed as a product of conditional probabilities for each control output:
    p_θ(U | F, Z) = ∏_{i=0}^{N_u} p_θ(u_i | F, Z, u_{0:i−1}),
    depicting the sequential nature of generating the control sequence U = u_0, …, u_{N_u}, with N_u denoting the length of the sequence.
The log-likelihood of observing the control sequence U given the features F can be broken down into two components, described by the Evidence Lower Bound (ELBO):
log p_θ(U | F) − D_KL(q_ϕ(Z | F, U) ‖ p_θ(Z | F, U)) = E_{q_ϕ(Z|F,U)}[ log p_θ(U | F, Z) ] − D_KL(q_ϕ(Z | F, U) ‖ p_θ(Z | F)).
Here, D K L denotes the Kullback–Leibler divergence, a measure of how one probability distribution diverges from a second, reference probability distribution.
Subsequently, the ELBO elucidated above is utilized to derive our loss function:
L(θ, ϕ) = − Σ_{k=1}^{K} Σ_{i=0}^{N_u} log p_θ(u_i^{(k)} | F, z_k, u_{0:i−1}^{(k)}) + λ · D_KL( N(μ_ϕ(F, U), Σ_ϕ(F, U)) ‖ N(0, I) ),
where λ is a hyperparameter that balances reconstruction accuracy against the divergence penalty.
The loss function employed in the network architecture accounts for control sequences derived from K latent samples within the reconstruction loss term. The value of λ is set to 1. The optimization of this model involves minimizing the loss function, focusing on the parameters θ and ϕ . This optimization process is critical for refining the network’s ability to accurately generate control sequences, ensuring that the resulting path planning aligns effectively with the specified objectives and constraints.
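A sketch of the corresponding loss computation, reusing the Seq2SeqCVAE sketch above and setting λ = 1 as in our experiments; the split of the GMM head and the teacher-forced input shift are our assumptions:

```python
import torch
from torch.distributions import (Categorical, Independent,
                                 MixtureSameFamily, Normal)

def gmm_nll(params, U, n_mix, u_dim):
    """Negative log-likelihood of the controls U under the per-step GMM."""
    logits, mu, log_std = params.split(
        [n_mix, n_mix * u_dim, n_mix * u_dim], dim=-1)
    B, T, _ = U.shape
    comp = Independent(Normal(mu.view(B, T, n_mix, u_dim),
                              log_std.view(B, T, n_mix, u_dim).exp()), 1)
    mix = MixtureSameFamily(Categorical(logits=logits), comp)
    return -mix.log_prob(U).sum(dim=1).mean()   # sum over steps, mean over batch

def cvae_loss(model, F, U, K=4, lam=1.0):
    """Reconstruction NLL summed over K latent samples + lam * KL(q || N(0, I))."""
    mu, logvar = model.encode(F, U)
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
    U_prev = torch.cat([torch.zeros_like(U[:, :1]), U[:, :-1]], dim=1)  # shift right
    nll = sum(gmm_nll(model.decode(F, mu + (0.5 * logvar).exp()
                                   * torch.randn_like(mu), U_prev),
                      U, model.n_mix, model.u_dim) for _ in range(K))
    return nll + lam * kl
```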
In the testing phase, the decoder is employed online to project conditioned latent samples z_1, …, z_K into control sequences Û_1, …, Û_K. Control sequences that extend beyond the reach of the end state x_e are disregarded. Path segments are then reconstructed from these generated control values. The process involves selecting the path segment with the minimum cost from these reconstructed segments, which is subsequently used as the output of the DL_Extend function. This approach ensures that the most cost-effective trajectory is chosen, aligning with the overall objective of optimizing path planning based on the deep learning-augmented algorithm.
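The testing-phase procedure can be sketched as follows, again reusing the Seq2SeqCVAE sketch above. Here sample_gmm draws one control from the GMM parameters of the latest decoder step, while rollout, cost, and the validity predicate valid stand in for trajectory reconstruction, cost evaluation, and the filtering of sequences that extend beyond the end state; all helper names are our assumptions:

```python
import torch
from torch.distributions import Categorical

def sample_gmm(params, n_mix, u_dim):
    """Draw one control from the GMM parameterized by the latest decoder step."""
    logits, mu, log_std = params.split([n_mix, n_mix * u_dim, n_mix * u_dim], -1)
    k = Categorical(logits=logits).sample()            # pick a mixture component
    idx = torch.arange(len(k))
    mu_k = mu.view(-1, n_mix, u_dim)[idx, k]
    std_k = log_std.view(-1, n_mix, u_dim).exp()[idx, k]
    return mu_k + std_k * torch.randn_like(std_k)

@torch.no_grad()
def dl_extend(model, F, x_s, p_t, rollout, cost, valid, K=8, T_steps=20, z_dim=32):
    """Decode K latent samples into control sequences and keep the cheapest
    valid path segment; invalid (overshooting) sequences are disregarded."""
    best, best_cost = None, float("inf")
    for _ in range(K):
        z = torch.randn(1, z_dim)
        u_prev = torch.zeros(1, 1, model.u_dim)        # start token u_0
        for _ in range(T_steps):                       # autoregressive decoding
            params = model.decode(F, z, u_prev)[:, -1] # latest step only
            u = sample_gmm(params, model.n_mix, model.u_dim)
            u_prev = torch.cat([u_prev, u.unsqueeze(1)], dim=1)
        path = rollout(x_s, u_prev[0, 1:].numpy())     # drop the start token
        if valid(path, p_t) and cost(path) < best_cost:
            best, best_cost = path, cost(path)
    return best
```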

4.3.5. Solution Determination Process

Upon completion of the iterative process, the algorithm identifies a valid solution if it locates a vertex corresponding to one of the accepting states, denoted by A ϕ . Q a c c . In instances where such a vertex remains unidentified, the algorithm outputs an empty set, signifying the absence of a viable solution within the designated computational time frame.
If a solution is pinpointed, the algorithm proceeds to select the most cost-effective vertex among those associated with the accepting states of the automaton. The path to the solution, represented as Ξ , is reconstructed by tracing back from this optimal vertex to the tree’s root (as outlined in Algorithm 1, line 10).

5. Experimental Results

The simulation showcases outcomes utilizing the Dubins car model, governed by the dynamics
ẋ = v cos(θ),  ẏ = v sin(θ),  θ̇ = ω,
with the car’s location represented by (x, y), its orientation by θ, and v and ω indicating its linear and angular velocities, respectively. For this simulation, we chose the planning parameters p_D = 0.7 and p_T = 0.5.
For the training of the DL_Extend function in Algorithm 2, the dataset was generated using costmaps created through Gaussian process regression, which incorporated random obstacles. The control values connecting two states were computed using a motion planning method detailed by Kobilarov et al. [52]. This method has been foundational in shaping our approach, particularly in generating realistic control values that reflect varying environmental conditions.
In our study, we constructed a comprehensive training dataset comprising 1000 pairs of positions and control inputs from each of the 500 generated costmaps, totaling 500,000 samples. We allocated 95% of these samples, or 475,000 pairs, for training our deep learning model, while the remaining 5%, or 25,000 samples, were used as validation data to monitor and refine the model’s performance during the training process.
This dataset is crucial in enhancing the robustness and accuracy of the DL_Extend function, as it enables the model to reflect diverse and realistic navigation scenarios effectively.
Our model’s architecture and the corresponding training parameters are detailed as follows:
  • Feature extractor: The initial stage of our model consists of four convolutional layers followed by two fully connected layers. These layers have dimensions of 256 and 128 for the hidden layers, respectively. This stage is crucial for extracting and compressing spatial features from the input data, which significantly influences the subsequent stages of trajectory prediction. The output dimension of the feature extractor is configured to 64.
  • Dimension of latent variable: The dimension of the latent space within our model is set to 32. This specification balances the complexity of the model with computational efficiency, allowing for effective feature representation and subsequent trajectory generation.
  • LSTM encoder–decoder: At the core of our model is an LSTM-based encoder–decoder structure, where both the encoder and decoder are equipped with 64-dimensional LSTM units. This setup is designed to handle the sequential nature of the control signals u, structured as three-dimensional arrays with dimensions (Batch x Timestep x Control_dim). It captures temporal dependencies essential for accurate predictions of future states based on past and present control inputs.
  • GMM projection module: After decoding, the output passes through a GMM projection module, comprising two fully connected layers with dimensions of 256 each. This module converts the LSTM outputs into a probabilistic framework, depicting potential trajectories as a mixture of Gaussian distributions. This feature is particularly beneficial in dynamic environments where inherent uncertainties must be managed.
  • Training regime: The network was comprehensively trained for 300 epochs, with a batch size of 32 to balance computational efficiency and effective learning. The chosen learning rate of 1 × 10^-3 and a weight decay of 1 × 10^-5 helped fine-tune the model parameters to minimize overfitting while optimizing performance.
The experimental setup was designed to evaluate the model’s capability to generate feasible trajectories under varying conditions. The results showcased the model’s robustness, with significant improvements in trajectory optimization demonstrated across a range of test scenarios. Figure 5 provides an overview of the network architecture, and detailed results from the training process are discussed, illustrating the effectiveness of the model in achieving low-cost, feasible trajectories within specified LTL constraints.
The loss function used for training our model, crucial for evaluating the efficacy of the trajectory predictions, is defined in Equation (12) of our manuscript. This equation is integral to understanding how the model minimizes errors and optimizes trajectory outputs during training.
The deep learning network and its training were implemented using PyTorch, and the algorithm was developed in Python. The graphical representations in this paper were generated using MATLAB’s image tool.
Three simulation experiments were conducted to demonstrate the proposed algorithm’s versatility and effectiveness. The initial experiment utilized a generated costmap to provide a controlled environment for baseline validation. The subsequent experiment employed an actual traffic accident density map from the city of Helsinki to illustrate the algorithm’s practical applicability in real-world settings. The final experiment was designed to assess the algorithm’s convergence toward the optimal solution, verifying its ability to achieve asymptotic optimality. Together, these simulations underscore the adaptability of the algorithm to a variety of scenarios.

5.1. The Generated Costmap

Three toy scenarios were explored, each with distinct costmaps, encompassing planning environments, regions of interest, and obstacles. These scenarios were tested against the following three Linear Temporal Logic (LTL) formulas:
ϕ_toy1 = ◊(a ∧ ◊(b))
ϕ_toy2 = ◊(a ∧ ◊(b ∧ ◊(c)))
ϕ_toy3 = ◊(a) ∧ ◊(b) ∧ ◊(c)
While ϕ t o y 1 and ϕ t o y 2 define sequential missions, ϕ t o y 3 outlines a coverage mission.
A visual snapshot of the tree extension process at a specific iteration is illustrated in Figure 6. This shows that path segments generated by the deep learning-enhanced extension predominantly traverse low-cost regions. Additionally, edges of the search tree modified by the rewiring process are displayed.
The proposed algorithm’s tree extension processes for various toy scenarios are detailed in Figure 7 and Figure 8. Over 300 iterations, the algorithm’s progression is depicted through 2D and 3D images, showcasing the expansion of the search tree at each stage (columns 1–4). For insights on the 2D images, refer to Figure 6. The final column presents the 2D representation of the solution after 300 iterations, highlighting the solution trajectory (in orange) alongside the search tree (in sky blue).
In the 3D images, individual layers correspond to specific automaton states. For instance, Figure 7B displays layers linked to automaton states Q_init, Q_1, Q_2, and Q_acc. Separately, the NFA for ϕ_toy3, illustrated in Figure 9, includes 8 automaton states, a result of its coverage mission specification. Concurrent tree extensions across multiple automaton states can be observed in Figure 8. By comparing intermediary solution costs to the final solution’s cost (normalized to 1) at the 300th iteration, the algorithm’s continuous refinement is evident. This enhancement is facilitated by a strategic rewiring process that frequently adjusts the edges of the search tree.
Comparative experiments, detailed in Table 1, benchmark the proposed approach against a variety of sampling-based path planning algorithms. In the table, “Proposed” denotes the new method introduced in this study, while “Proposed (wo/r)” represents the same method but with the rewiring procedure excluded. This approach is compared with two cost-aware path planning algorithms specifically designed for LTL missions: LBPP-LTL [25] and CAPP-LTL [46]. Additionally, the study includes comparisons with two LTL-constrained, sampling-based path planning algorithms: Bahita [35] and McMahon [37]. These algorithms are iterative, operating until a predefined time limit is reached, at which point the solution with the lowest cost is selected.
The results for the three distinct scenarios, labeled ϕ_toy1, ϕ_toy2, and ϕ_toy3, are compiled in Table 1. Each algorithm was subjected to 100 independent trials, with each trial introducing a different start state. The operating time for all algorithms was fixed at 420 s, and interim assessments were conducted at 60-second intervals after the initial 120 s. For a clear comparative analysis, a relative average cost metric was employed. The final average cost of the “Proposed” approach (at time = 420 s) served as the baseline (1), and the performance of the other algorithms was measured relative to this benchmark.
The analysis reveals that the “Proposed” algorithm consistently identifies more cost-efficient paths, achieving the best average cost after 420 s. A notable observation is the importance of the rewiring function, as evidenced by the comparison with “Proposed (wo/r)”. Although the initial performance of the rewiring process may seem modest, its efficiency becomes more pronounced over time. The study also highlights that the proposed method outperforms other cost-focused path planning algorithms like LBPP-LTL and CAPP-LTL. The divergence of CAPP-LTL from the proposed method is attributed to its reliance on a more time-intensive cross-entropy optimization, as opposed to the deep learning-based extension used in the proposed method. While both the proposed method and LBPP-LTL incorporate a learning-based extension during the tree extension phase, LBPP-LTL differs by requiring discrete planning in each iteration and not utilizing the specialized deep learning network for control sequence generation that is a feature of the proposed method. Conversely, non-cost-centric methods like Bahita and McMahon did not demonstrate significant reductions in the cost value as the time elapsed.
Additional experiments were conducted in scenarios with increased obstacle density. Figure 10 showcases the solution paths generated by the proposed algorithm for specific LTL formulas (ϕ_toy4 to ϕ_toy7) over 700 iterations. In each scenario, the algorithm’s progress is depicted at various iteration milestones. The initial columns display the evolution of the search tree through iterations, as well as the rewiring procedures, with reference to Figure 6. The final column highlights the final solution path in orange and the search tree in sky blue, without distinct representation of rewiring edges.
Table 2 showcases the outcomes of comparative experiments, similar to those detailed in Table 1. Each scenario was specifically designed with particular regions of interest and obstacle configurations, as shown in Figure 10. The increased obstacle density in these scenarios led to more frequent collision checks, thereby extending the computational time required. The experimental setup involved conducting 100 trials for each scenario, with randomized initial states and costmaps to ensure variability.
For the purposes of a comprehensive comparative analysis, the results were quantified using a relative average cost metric. The average cost recorded by the “Proposed” method at 1200 s served as the reference point (normalized to 1), allowing for a direct comparison with the performance metrics of other algorithms. The findings from these experiments validate the robust performance of the proposed algorithm, even in environments characterized by an increased presence of obstacles.
Figure 11 illustrates the adaptability of the proposed algorithm in an environment with an even higher density of obstacles, presenting an elevated challenge for path planning. The figure depicts the algorithm’s performance over 800 iterations for scenarios characterized by specific LTL formulas (from ϕ_toy8 to ϕ_toy11). For each scenario, the progression of the algorithm is captured at predetermined iteration milestones, offering a clear view of the search tree development (indicated in sky blue) and the evolving solution path (highlighted in red). A key observation from Figure 11 is the algorithm’s consistent ability to navigate the increased complexity and identify a viable low-cost path that complies with the given LTL mission requirements. This demonstrates the algorithm’s competence in managing densely populated obstacle spaces without compromising the efficiency or cost-effectiveness of the path planning process.

5.2. The Helsinki Traffic Accident Map

In the study, a navigation scenario is explored where a robot is tasked with a traffic surveillance mission in a specific area of Helsinki city, as shown in Figure 12. The scenario involves four designated regions of interest, labeled alphabetically, and obstacle zones are indicated by gray boxes. A traffic accident density map for this scenario is created using accumulated traffic accident data [53], with Gaussian process regression employed for its generation. The mission’s objective is to efficiently navigate through areas prone to traffic accidents. Consequently, zones with low accident densities are assigned lower costs, while regions with high accident densities are identified as high-cost areas.
Four distinct scenarios are tested, each with varied task specifications: Scenarios 1 and 2 are sequential missions, while Scenarios 3 and 4 are coverage missions. Their respective LTL formulas are stated below.
  • Helsinki scenario 1: ϕ_h1 = ◊(a ∧ ◊(b ∧ ◊(c)));
  • Helsinki scenario 2: ϕ_h2 = ◊(a ∧ ◊(b ∧ ◊(c ∧ ◊(d))));
  • Helsinki scenario 3: ϕ_h3 = ◊(a) ∧ ◊(b) ∧ ◊(c);
  • Helsinki scenario 4: ϕ_h4 = ◊(b) ∧ ◊(c) ∧ ◊(d).
Each LTL formula details the robot’s objectives. For instance, ϕ_h1 means “visit regions a, b, and c in sequence”, whereas ϕ_h3 means “cover regions a, b, and c”.
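Mission automata like these can be reproduced with off-the-shelf LTL tooling; the sketch below uses the Spot library’s Python bindings, which is one possible tool rather than necessarily the one used in this work (Spot’s F operator denotes “eventually”, matching ◊ above):

```python
import spot  # Spot's Python bindings: https://spot.lre.epita.fr/

# Helsinki scenario 1: eventually a, and thereafter b, and thereafter c.
phi_h1 = spot.formula("F(a & F(b & F(c)))")
# Helsinki scenario 3: cover a, b, and c in any order.
phi_h3 = spot.formula("F(a) & F(b) & F(c)")

# Syntactically co-safe formulas fall into Spot's "guarantee" class.
assert phi_h1.is_syntactic_guarantee()
assert phi_h3.is_syntactic_guarantee()

# Translate to a state-based Buechi automaton; for co-safe formulas every
# accepted word has a finite accepting prefix, so the automaton plays the
# role of the NFA that guides the high-level discrete layer.
aut = spot.translate(phi_h1, "ba")
print(aut.num_states())
```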
Figure 13 displays the solution paths generated by the proposed algorithm for the various scenarios within 400 iterations. The algorithmic progress is shown at specific iteration counts for each scenario. The initial columns illustrate the search tree’s evolution across iterations, together with the application of the deep learning-based extension and rewiring procedures (cf. Figure 6). The concluding column shows the final solution path in orange and the search tree in sky blue, without distinct emphasis on rewiring edges. As iterations progress, the algorithm consistently produces paths that adhere to the LTL mission and traverse low-cost regions effectively.
Particularly in the coverage missions, Helsinki scenarios 3 and 4, significant variations are observed in the order in which the regions of interest are traversed. For instance, in Helsinki scenario 3, the path follows the sequence b → c → a at iterations 50 and 100 before ultimately converging to the more cost-effective ordering a → c → b. Similarly, in Helsinki scenario 4, the initial sequence c → d → b is optimized to c → b → d in the final solution. The algorithm effectively explores alternative plans within the search tree, leading to the identification of more cost-efficient paths that fulfill the coverage mission requirements.

5.3. Convergence Analysis

Our convergence analysis within a 2D environment utilized a simple dynamic model with the state equations
ẋ = v_x,  ẏ = v_y,
where (x, y) denotes the position and (v_x, v_y) the velocity.
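Under this single-integrator model, steering toward a sampled target amounts to applying a speed-limited constant velocity command; a minimal Euler-integration sketch of this step follows (the function name and parameter values are illustrative, not the paper’s implementation):

```python
import numpy as np

def steer(start, target, v_max=1.0, dt=0.05, max_steps=20):
    """Propagate the single-integrator model (x' = v_x, y' = v_y) toward target.

    Holds a constant, speed-limited velocity command and integrates with
    explicit Euler steps, returning the resulting path segment.
    """
    state = np.asarray(start, dtype=float)
    goal = np.asarray(target, dtype=float)
    direction = goal - state
    dist = np.linalg.norm(direction)
    v = v_max * direction / dist if dist > 1e-9 else np.zeros(2)
    path = [state.copy()]
    for _ in range(max_steps):
        state = state + v * dt  # Euler step of the state equations above
        path.append(state.copy())
        if np.linalg.norm(goal - state) < v_max * dt:  # close enough to stop
            break
    return np.array(path)

segment = steer([0.0, 0.0], [1.0, 0.5])  # short edge toward a sampled point
```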
The performance of our algorithm was assessed using two Linear Temporal Logic (LTL) formulas:
ϕ_conv1 = ◊(a) ∧ ◊(b) ∧ ◊(c) ∧ ◊(d),  ϕ_conv2 = ◊(a ∧ ◊(b ∧ ◊(c))).
Here, ϕ_conv1 defines a coverage mission, requiring the algorithm to visit four specified regions, while ϕ_conv2 specifies a sequential mission, mandating visits to three regions in a particular order.
The simulation results, depicted in Figure 14, demonstrate the algorithm’s cost convergence. As the number of iterations increases, the trajectory cost approaches the optimal value, empirically supporting the asymptotic optimality of the proposed approach.

6. Conclusions

This paper presents a novel path planning method tailored for co-safe LTL specifications, introducing significant advancements in handling logical constraints within complex environments. A distinctive feature of this approach is its reliance on minimal discrete abstraction, eliminating the need for traditional mesh decomposition. This innovation not only streamlines the extraction of discrete plans but also aligns closely with LTL missions, significantly enhancing planning efficiency by removing the necessity for intermediate discrete plan generation during iterative processes.
Our experimental results demonstrate that this methodology markedly improves planning efficiency. The integration of a deep learning network to refine the search tree expansion process has been shown to enhance the system’s performance across diverse simulation environments, providing empirical evidence of the method’s effectiveness and scalability.
Despite these advancements, we acknowledge that the approach still encounters challenges related to computational efficiency, especially in high-dimensional spaces, and limitations in the rewiring process under complex system dynamics. These limitations will be the focus of our future research, aiming to refine and optimize our method further.
Looking ahead, future research will also concentrate on overcoming the challenges of high-dimensional path planning under logical constraints with greater efficiency. Particular attention will be directed towards task-based planning for multi-joint manipulators, leveraging the learning-based strategies developed here. This focus is crucial for advancing robotic and autonomous system capabilities, enabling them to navigate complex tasks and environments more effectively.
In conclusion, the proposed path planning methodology leverages minimal discrete abstraction and deep learning-based extensions to offer a significant breakthrough in developing robust and efficient autonomous systems. These systems are well-equipped to navigate intricate environments and fulfill complex mission objectives, setting a new benchmark in the field. We are committed to addressing the noted shortcomings in future iterations of our research, enhancing both the practicality and effectiveness of our approach.

Author Contributions

Conceptualization, C.B. and K.C.; methodology, C.B.; validation, K.C.; data curation, C.B. and K.C.; writing—original draft preparation, C.B.; writing—review and editing, K.C.; visualization, C.B. and K.C.; supervision, K.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Innovative Human Resource Development for Local Intellectualization program through the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (IITP-2024-RS-2023-00259678).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Cui, J.; Trinkle, J. Toward next-generation learned robot manipulation. Sci. Robot. 2021, 6, eabd9461. [Google Scholar] [CrossRef] [PubMed]
  2. Pairet, È.; Chamzas, C.; Petillot, Y.; Kavraki, L.E. Path planning for manipulation using experience-driven random trees. IEEE Robot. Autom. Lett. 2021, 6, 3295–3302. [Google Scholar] [CrossRef]
  3. Qureshi, A.H.; Dong, J.; Baig, A.; Yip, M.C. Constrained motion planning networks X. IEEE Trans. Robot. 2021, 38, 868–886. [Google Scholar] [CrossRef]
  4. Xu, K.; Yu, H.; Huang, R.; Guo, D.; Wang, Y.; Xiong, R. Efficient Object Manipulation to an Arbitrary Goal Pose: Learning-based Anytime Prioritized Planning. In Proceedings of the IEEE International Conference on Robotics and Automation, Philadelphia, PA, USA, 23–27 May 2022; pp. 7277–7283. [Google Scholar]
  5. Nair, S.; Rajeswaran, A.; Kumar, V.; Finn, C.; Gupta, A. R3m: A universal visual representation for robot manipulation. In Proceedings of the Conference on Robot Learning, Auckland, New Zealand, 14–18 December 2022; pp. 416–426. [Google Scholar]
  6. Fridovich-Keil, D.; Bajcsy, A.; Fisac, J.F.; Herbert, S.L.; Wang, S.; Dragan, A.D.; Tomlin, C.J. Confidence-aware motion prediction for real-time collision avoidance. Int. J. Robot. Res. 2020, 39, 250–265. [Google Scholar] [CrossRef]
  7. Xie, L.; Miao, Y.; Wang, S.; Blunsom, P.; Wang, Z.; Chen, C.; Markham, A.; Trigoni, N. Learning with stochastic guidance for robot navigation. IEEE Trans. Neural Netw. Learn. Syst. 2020, 32, 166–176. [Google Scholar] [CrossRef] [PubMed]
  8. Eiffert, S.; Kong, H.; Pirmarzdashti, N.; Sukkarieh, S. Path planning in dynamic environments using generative rnns and monte carlo tree search. In Proceedings of the IEEE International Conference on Robotics and Automation, Paris, France, 31 May–31 August 2020; pp. 10263–10269. [Google Scholar]
  9. Lin, J.; Zhou, T.; Zhu, D.; Liu, J.; Meng, M.Q.H. Search-Based Online Trajectory Planning for Car-like Robots in Highly Dynamic Environments. In Proceedings of the IEEE International Conference on Robotics and Automation, Xi’an, China, 30 May–5 June 2021; pp. 8151–8157. [Google Scholar]
  10. Hayat, S.; Yanmaz, E.; Brown, T.X.; Bettstetter, C. Multi-objective UAV path planning for search and rescue. In Proceedings of the IEEE International Conference on Robotics and Automation, Singapore, 29 May–3 June 2017; pp. 5569–5574. [Google Scholar]
  11. Cabreira, T.M.; Di Franco, C.; Ferreira, P.R.; Buttazzo, G.C. Energy-aware spiral coverage path planning for uav photogrammetric applications. IEEE Robot. Autom. Lett. 2018, 3, 3662–3668. [Google Scholar] [CrossRef]
  12. Chen, J.; Du, C.; Zhang, Y.; Han, P.; Wei, W. A clustering-based coverage path planning method for autonomous heterogeneous UAVs. IEEE Trans. Intell. Transp. Syst. 2021, 23, 25546–25556. [Google Scholar] [CrossRef]
  13. Li, B.; Page, B.R.; Moridian, B.; Mahmoudian, N. Collaborative mission planning for long-term operation considering energy limitations. IEEE Robot. Autom. Lett. 2020, 5, 4751–4758. [Google Scholar] [CrossRef]
  14. Apostolidis, S.D.; Kapoutsis, P.C.; Kapoutsis, A.C.; Kosmatopoulos, E.B. Cooperative multi-UAV coverage mission planning platform for remote sensing applications. Auton. Robot. 2022, 46, 373–400. [Google Scholar] [CrossRef]
  15. Fainekos, G.E.; Kress-Gazit, H.; Pappas, G.J. Temporal logic motion planning for mobile robots. In Proceedings of the IEEE International Conference on Robotics and Automation, Barcelona, Spain, 18–22 April 2005. [Google Scholar]
  16. Karaman, S.; Frazzoli, E. Sampling-based motion planning with deterministic μ-calculus specifications. In Proceedings of the 48th IEEE Conference on Decision and Control (CDC) Held Jointly with 2009 28th Chinese Control Conference, Shanghai, China, 15–18 December 2009; pp. 2222–2229. [Google Scholar]
  17. Lahijanian, M.; Wasniewski, J.; Andersson, S.B.; Belta, C. Motion planning and control from temporal logic specifications with probabilistic satisfaction guarantees. In Proceedings of the IEEE International Conference on Robotics and Automation, Anchorage, AK, USA, 3–7 May 2010; pp. 3227–3232. [Google Scholar]
  18. Huang, Y.; Wang, N.; Li, J. A task planning method for multi-UAV from local LTL specifications. In Proceedings of the IEEE International Conference on Unmanned Systems, Beijing, China, 15–17 October 2021; pp. 618–623. [Google Scholar]
  19. Purohit, P.; Saha, I. DT: Temporal logic path planning in a dynamic environment. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Prague, Czech Republic, 27 September–1 October 2021; pp. 3627–3634. [Google Scholar]
  20. Nenchev, V.; Belta, C. Receding horizon robot control in partially unknown environments with temporal logic constraints. In Proceedings of the European Control Conference, Aalborg, Denmark, 29 June–1 July 2016; pp. 2614–2619. [Google Scholar]
  21. Cai, M.; Peng, H.; Li, Z.; Gao, H.; Kan, Z. Receding horizon control-based motion planning with partially infeasible LTL constraints. IEEE Control Syst. Lett. 2020, 5, 1279–1284. [Google Scholar] [CrossRef]
  22. Ichter, B.; Harrison, J.; Pavone, M. Learning sampling distributions for robot motion planning. In Proceedings of the IEEE International Conference on Robotics and Automation, Brisbane, QLD, Australia, 21–25 May 2018; pp. 7087–7094. [Google Scholar]
  23. Ichter, B.; Pavone, M. Robot motion planning in learned latent spaces. IEEE Robot. Autom. Lett. 2019, 4, 2407–2414. [Google Scholar] [CrossRef]
  24. Wang, J.; Chi, W.; Li, C.; Wang, C.; Meng, M.Q.H. Neural RRT*: Learning-based optimal path planning. IEEE Trans. Autom. Sci. Eng. 2020, 17, 1748–1758. [Google Scholar] [CrossRef]
  25. Cho, K. Learning-based path planning under co-safe temporal logic specifications. IEEE Access 2023, 11, 25865–25878. [Google Scholar] [CrossRef]
  26. Karaman, S.; Frazzoli, E. Sampling-based algorithms for optimal motion planning. Int. J. Robot. Res. 2011, 30, 846–894. [Google Scholar] [CrossRef]
  27. Smith, S.L.; Tumova, J.; Belta, C.; Rus, D. Optimal path planning for surveillance with temporal logic constraints. Int. J. Robot. Res. 2011, 30, 1695–1708. [Google Scholar] [CrossRef]
  28. Wolff, E.M.; Topcu, U.; Murray, R.M. Optimal Control with Weighted Average Costs and Temporal Logic Specifications. Robot. Sci. Syst. 2013, 8, 449–456. [Google Scholar]
  29. LaValle, S.M.; Kuffner, J.J. Randomized kinodynamic planning. Int. J. Robot. Res. 2001, 20, 378–400. [Google Scholar] [CrossRef]
  30. Choset, H.; Lynch, K.M.; Hutchinson, S.; Kantor, G.; Burgard, W.; Kavraki, L.E.; Thrun, S. Principles of Robot Motion: Theory, Algorithms, and Implementation; MIT Press: Cambridge, MA, USA, 2005. [Google Scholar]
  31. LaValle, S.M. Planning Algorithms; Cambridge University Press: Cambridge, UK, 2006. [Google Scholar]
  32. Karaman, S.; Frazzoli, E. Sampling-based algorithms for optimal motion planning with deterministic μ-calculus specifications. In Proceedings of the IEEE American Control Conference, Montreal, QC, Canada, 27–29 June 2012. [Google Scholar]
  33. Karaman, S.; Sanfelice, R.G.; Frazzoli, E. Optimal control of mixed logical dynamical systems with linear temporal logic specifications. In Proceedings of the IEEE Conference on Decision and Control, Cancun, Mexico, 9–11 December 2008. [Google Scholar]
  34. Wolff, E.M.; Topcu, U.; Murray, R.M. Optimization-based control of nonlinear systems with linear temporal logic specifications. In Proceedings of the IEEE International Conference on Robotics and Automation, Hong Kong, China, 31 May–7 June 2014; pp. 5319–5325. [Google Scholar]
  35. Bhatia, A.; Kavraki, L.E.; Vardi, M.Y. Sampling-based motion planning with temporal goals. In Proceedings of the IEEE International Conference on Robotics and Automation, Anchorage, AK, USA, 3–7 May 2010. [Google Scholar]
  36. Plaku, E. Planning in discrete and continuous spaces: From LTL tasks to robot motions. In Proceedings of the Towards Autonomous Robotic Systems, Bristol, UK, 20–23 August 2012. [Google Scholar]
  37. McMahon, J.; Plaku, E. Sampling-based tree search with discrete abstractions for motion planning with dynamics and temporal logic. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Chicago, IL, USA, 14–18 September 2014. [Google Scholar]
  38. Cho, K. A hierarchical learning approach to autonomous driving using rule specifications. IEEE Access 2022, 10, 74815–74824. [Google Scholar] [CrossRef]
  39. Aloor, J.J.; Patrikar, J.; Kapoor, P.; Oh, J.; Scherer, S. Follow the rules: Online signal temporal logic tree search for guided imitation learning in stochastic domains. In Proceedings of the IEEE International Conference on Robotics and Automation, London, UK, 29 May–2 June 2023; pp. 1320–1326. [Google Scholar]
  40. Wang, Y.; Figueroa, N.; Li, S.; Shah, A.; Shah, J. Temporal Logic Imitation: Learning Plan-Satisficing Motion Policies from Demonstrations. arXiv 2022, arXiv:2206.04632. [Google Scholar]
  41. Dhonthi, A.; Schillinger, P.; Rozo, L.; Nardi, D. Optimizing demonstrated robot manipulation skills for temporal logic constraints. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Kyoto, Japan, 23–27 October 2022; pp. 1255–1262. [Google Scholar]
  42. Baier, C.; Katoen, J.P. Principles of Model Checking; MIT Press: Cambridge, MA, USA, 2008. [Google Scholar]
  43. Sistla, A.P. Safety, liveness and fairness in temporal logic. Form. Asp. Comput. 1994, 6, 495–511. [Google Scholar] [CrossRef]
  44. Kupferman, O.; Vardi, M.Y. Model checking of safety properties. Form. Methods Syst. Des. 2001, 19, 291–314. [Google Scholar] [CrossRef]
  45. Plaku, E.; Kavraki, L.E.; Vardi, M.Y. Motion planning with dynamics by a synergistic combination of layers of planning. IEEE Trans. Robot. 2010, 26, 469–482. [Google Scholar] [CrossRef]
  46. Cho, K.; Suh, J.; Tomlin, C.J.; Oh, S. Cost-aware path planning under co-safe temporal logic specifications. IEEE Robot. Autom. Lett. 2017, 2, 2308–2315. [Google Scholar] [CrossRef]
  47. Sohn, K.; Lee, H.; Yan, X. Learning structured output representation using deep conditional generative models. Adv. Neural Inf. Process. Syst. 2015, 28. [Google Scholar]
  48. Sutskever, I.; Vinyals, O.; Le, Q.V. Sequence to sequence learning with neural networks. Adv. Neural Inf. Process. Syst. 2014, 27. [Google Scholar]
  49. Lee, N.; Choi, W.; Vernaza, P.; Choy, C.B.; Torr, P.H.; Chandraker, M. Desire: Distant future prediction in dynamic scenes with interacting agents. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 336–345. [Google Scholar]
  50. Schmerling, E.; Leung, K.; Vollprecht, W.; Pavone, M. Multimodal probabilistic model-based planning for human-robot interaction. In Proceedings of the IEEE International Conference on Robotics and Automation, Brisbane, QLD, Australia, 21–25 May 2018; pp. 1–9. [Google Scholar]
  51. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
  52. Kobilarov, M. Cross-entropy motion planning. Int. J. Robot. Res. 2012, 31, 855–871. [Google Scholar] [CrossRef]
  53. Traffic Accidents in Helsinki. Available online: http://www.hri.fi/en/dataset/liikenneonnettomuudet-helsingissa (accessed on 23 March 2024).
Figure 1. An overview of the proposed method, illustrating the interplay between the high-level layer that meets LTL specifications and the low-level layer that searches for low-cost trajectories, underscored by enhancements from our previous work [25].
Figure 2. An illustrative trajectory generated by the proposed method within the test scenario. The search tree, visualized in orange, strategically guides the trajectory through areas of lower cost, indicated by the blue regions, while adhering to the specified LTL formula ϕ = ◊(a ∧ ◊(b)).
Figure 3. An NFA for the syntactically co-safe LTL formula ϕ = ◊(a ∧ ◊(b ∧ ◊(c))). There are four automaton states, with input alphabets shown in the transition relations.
Figure 4. Input and output data configuration for the proposed deep learning network, involving the optimal path connecting the starting state and end state.
Figure 5. Training and testing phases of the proposed architecture. Predicted values are denoted with the hat notation. Components such as the costmap (c), obstacles (obs), start state (x_s), and end state (x_e) are merged into feature F via the feature extractor network. The output control sequence samples, denoted Û_1, …, Û_K, are converted into trajectories through the dynamic model (1). In the testing phase, the decoder operates online to convert the conditioned latent samples z_1, …, z_K into the control distribution.
Figure 6. A snapshot of the tree extension process. Within the learning-enhanced extension, various path segments traverse the low-cost region, and the minimum-cost path segment is chosen to extend the search tree.
Figure 7. Visualization of the tree extension process for the LTL formulas ϕ_toy1 and ϕ_toy2 using the proposed algorithm. In the cost map, areas in blue signify low cost, while those in yellow indicate high cost. The extent of tree extension at distinct iteration steps is illustrated in both 2D and 3D. In the 3D visualizations, distinct layers exist, with each layer signifying the search tree associated with a specific automaton state. Annotations beneath each depiction provide the trajectory cost (scaled against the normalized final solution cost) and its compliance with the respective LTL formula.
Figure 8. Visualization of the tree extension process using the proposed algorithm for the LTL formula ϕ_toy3 = ◊(a) ∧ ◊(b) ∧ ◊(c). Refer to Figure 7 for detailed annotations and explanations.
Figure 9. An NFA representing the LTL formula ϕ_toy3 = ◊(a) ∧ ◊(b) ∧ ◊(c) with 8 automaton states.
Figure 10. Solution paths and search tree progression demonstrated by the proposed algorithm in scenarios ϕ_toy4 to ϕ_toy7, featuring additional obstacles. In ϕ_toy6, the symbol w represents the workspace areas that are not designated as regions of interest or obstacles. For the purpose of comparative analysis, the trajectory cost in these scenarios is normalized to 1 at the 700th iteration.
Figure 11. Solution paths and search tree progression for the LTL formulas ϕ_toy8 to ϕ_toy11 presented by the proposed algorithm under conditions of increased obstacle density. To facilitate comparative analysis, trajectory costs are normalized to 1 at the 800th iteration, underscoring the algorithm’s efficiency even in these more intricate scenarios.
Figure 12. The Helsinki traffic accident map. (a) Areas in Helsinki where traffic accidents occurred are marked with red dots. Alphabetically labeled regions represent regions of interest, while gray zones denote obstacles. (b) A density map generated from traffic accident data with blue indicating low accident density and red indicating high accident density.
Figure 13. Visual representation of the tree extension method applied to the Helsinki scenarios. Annotations beneath each image convey the trajectory cost relative to the final solution’s cost (iter = 400), which is normalized to 1.
Figure 14. Convergence analysis for two LTL missions. The left panels show the solution paths generated by the proposed algorithm (in orange), starting from the initial state (marked by a blue square), with the search trees in sky blue. The right panels plot the trajectory costs over iterations (in red), with the optimal costs depicted for reference (in blue).
Table 1. Comparative analysis of path planning algorithms across toy scenarios (ϕ_toy1–ϕ_toy3). This table presents the comparative performance of various path planning algorithms across three toy scenarios, labeled ϕ_toy1, ϕ_toy2, and ϕ_toy3. It reports the relative average trajectory costs over 100 trials, with each algorithm’s cost normalized against the results of the “Proposed” method at 420 s. The comparison highlights the efficiency and effectiveness of the “Proposed” approach relative to other established algorithms in the field.

| Scenario | Method | 120 s | 180 s | 240 s | 300 s | 360 s | 420 s |
|----------|--------|-------|-------|-------|-------|-------|-------|
| ϕ_toy1 | Proposed | 1.72 | 1.53 | 1.41 | 1.25 | 1.15 | 1.00 |
| | Proposed (wo/r) | 1.71 | 1.51 | 1.47 | 1.38 | 1.31 | 1.23 |
| | LBPP-LTL | 1.75 | 1.62 | 1.48 | 1.36 | 1.22 | 1.12 |
| | CAPP-LTL | 1.89 | 1.79 | 1.71 | 1.65 | 1.58 | 1.50 |
| | Bhatia | 2.39 | 2.27 | 2.22 | 2.18 | 2.11 | 2.08 |
| | McMahon | 2.21 | 2.10 | 2.01 | 1.96 | 1.92 | 1.83 |
| ϕ_toy2 | Proposed | 1.45 | 1.37 | 1.29 | 1.19 | 1.11 | 1.00 |
| | Proposed (wo/r) | 1.43 | 1.36 | 1.28 | 1.22 | 1.19 | 1.15 |
| | LBPP-LTL | 1.52 | 1.47 | 1.35 | 1.24 | 1.17 | 1.07 |
| | CAPP-LTL | 1.65 | 1.55 | 1.47 | 1.41 | 1.32 | 1.20 |
| | Bhatia | 2.12 | 2.01 | 1.97 | 1.91 | 1.88 | 1.85 |
| | McMahon | 1.89 | 1.82 | 1.78 | 1.73 | 1.71 | 1.69 |
| ϕ_toy3 | Proposed | 1.96 | 1.76 | 1.52 | 1.39 | 1.21 | 1.00 |
| | Proposed (wo/r) | 1.95 | 1.74 | 1.49 | 1.41 | 1.35 | 1.28 |
| | LBPP-LTL | 2.01 | 1.82 | 1.64 | 1.49 | 1.38 | 1.24 |
| | CAPP-LTL | 2.13 | 1.91 | 1.83 | 1.72 | 1.64 | 1.48 |
| | Bhatia | 2.46 | 2.37 | 2.29 | 2.17 | 2.11 | 2.04 |
| | McMahon | 2.27 | 2.15 | 2.09 | 2.04 | 1.97 | 1.92 |
Table 2. Comparative performance of path planning algorithms in enhanced obstacle scenarios (ϕ_toy4–ϕ_toy7). This table presents a comparative analysis of various path planning algorithms across enhanced obstacle scenarios, denoted ϕ_toy4 through ϕ_toy7. It tabulates the relative average trajectory costs over 100 trials for each scenario, normalized against the results of the “Proposed” method at 1200 s.

| Scenario | Method | 200 s | 400 s | 600 s | 800 s | 1000 s | 1200 s |
|----------|--------|-------|-------|-------|-------|--------|--------|
| ϕ_toy4 | Proposed | 1.72 | 1.56 | 1.38 | 1.25 | 1.18 | 1.00 |
| | Proposed (wo/r) | 1.80 | 1.65 | 1.47 | 1.38 | 1.30 | 1.23 |
| | LBPP-LTL | 1.75 | 1.59 | 1.41 | 1.29 | 1.23 | 1.08 |
| | CAPP-LTL | 1.83 | 1.70 | 1.51 | 1.42 | 1.35 | 1.28 |
| | Bhatia | 2.61 | 2.53 | 2.45 | 2.40 | 2.33 | 2.29 |
| | McMahon | 2.35 | 2.28 | 2.21 | 2.13 | 2.05 | 1.99 |
| ϕ_toy5 | Proposed | 2.26 | 1.81 | 1.52 | 1.32 | 1.20 | 1.00 |
| | Proposed (wo/r) | 2.31 | 1.84 | 1.57 | 1.36 | 1.29 | 1.25 |
| | LBPP-LTL | 2.29 | 1.85 | 1.53 | 1.35 | 1.24 | 1.10 |
| | CAPP-LTL | 2.37 | 1.92 | 1.60 | 1.39 | 1.28 | 1.19 |
| | Bhatia | 2.45 | 2.31 | 2.25 | 2.19 | 2.08 | 1.96 |
| | McMahon | 2.28 | 2.12 | 2.04 | 1.96 | 1.91 | 1.88 |
| ϕ_toy6 | Proposed | 1.29 | 1.24 | 1.19 | 1.11 | 1.05 | 1.00 |
| | Proposed (wo/r) | 1.32 | 1.27 | 1.20 | 1.18 | 1.16 | 1.14 |
| | LBPP-LTL | 1.33 | 1.25 | 1.21 | 1.14 | 1.09 | 1.06 |
| | CAPP-LTL | 1.39 | 1.30 | 1.26 | 1.20 | 1.14 | 1.11 |
| | Bhatia | 1.68 | 1.62 | 1.59 | 1.52 | 1.49 | 1.45 |
| | McMahon | 1.65 | 1.60 | 1.56 | 1.51 | 1.44 | 1.41 |
| ϕ_toy7 | Proposed | 1.42 | 1.34 | 1.27 | 1.19 | 1.09 | 1.00 |
| | Proposed (wo/r) | 1.46 | 1.35 | 1.29 | 1.23 | 1.19 | 1.12 |
| | LBPP-LTL | 1.47 | 1.36 | 1.28 | 1.21 | 1.14 | 1.08 |
| | CAPP-LTL | 1.56 | 1.44 | 1.36 | 1.28 | 1.21 | 1.16 |
| | Bhatia | 1.89 | 1.81 | 1.76 | 1.72 | 1.69 | 1.65 |
| | McMahon | 1.75 | 1.73 | 1.70 | 1.64 | 1.62 | 1.59 |
