Opinion

Expanding Ground Vehicle Autonomy into Unstructured, Off-Road Environments: Dataset Challenges

U.S. Army Engineer Research and Development Center, Vicksburg, MS 39180-6199, USA
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(18), 8410; https://doi.org/10.3390/app14188410
Submission received: 1 August 2024 / Revised: 4 September 2024 / Accepted: 17 September 2024 / Published: 18 September 2024
(This article belongs to the Special Issue Advances in Autonomous Driving and Smart Transportation)

Abstract

As with the broader field of deep learning, autonomy is a research topic that has experienced an explosion of attention from both the scientific and commercial communities due to its potential to advance humanity across many cross-cutting disciplines. Recent advancements in computer vision-based autonomy have highlighted the potential for increasingly sophisticated autonomous ground vehicles for both commercial and non-traditional applications, such as grocery delivery. Part of the success of these technologies has been the abundance of training data available for developing the autonomous behaviors associated with their autonomy software. This data abundance advantage quickly diminishes when an application moves from structured environments, i.e., well-defined city road networks, highways, street signage, etc., into unstructured environments, i.e., cross-country, off-road, and non-traditional terrains. Herein, we aim to present insights, from a dataset perspective, into how the scientific community can begin to expand autonomy into unstructured environments, while highlighting some of the key challenges posed by such dynamic and ever-changing environments. Finally, a foundation is laid for the creation of a robust off-road dataset being developed by the Engineer Research and Development Center and Mississippi State University’s Center for Advanced Vehicular Systems.

1. Introduction

Rapid advancements are being made in the area of autonomy for the application of autonomous driving by unmanned ground vehicles (UGVs) [1,2,3,4,5,6,7]. In the commercial sector, autonomous vehicles are commonly referred to as ‘self-driving’ cars. Tesla and Waymo are two entities that have seen recent successes to varying degrees in their autonomous navigation aspirations in the public domain [8,9]. To the best of our knowledge, both employ deep learning (DL) methodologies for computer vision (CV) tasks to understand their sensed environment [10,11,12,13]. This scene understanding is required to enable the development of autonomous behaviors and navigation strategies for a smooth passenger experience. While there are many factors and components that go into the development of a fully autonomous UGV software stack, this paper focuses on computer vision-related tasks.
In this paper, we address a significant research gap in the field of autonomy for unstructured environments [14]. These environments have numerous potential applications, such as search and rescue missions and natural disaster responses. However, creating and using robust and diverse datasets that capture the complexity and dynamics of these environments poses significant challenges for computer vision-based autonomous navigation. We present some of the key difficulties involved in creating and utilizing such datasets, and propose a framework for developing a comprehensive and high-quality off-road dataset. By doing so, we aim to contribute to the advancement of the field and enable more effective and reliable autonomy in highly complex and unstructured environments.
The remainder of this paper is organized as follows. This section will conclude with a quick look into the most popular publicly available datasets for unstructured environments. In Section 2, challenges associated with the development of a dataset for CV-based autonomy applications are presented, along with deeper philosophical questions that must be considered when creating such a dataset. A framework for a new off-road dataset that is being developed by the Engineer Research and Development Center and Mississippi State University’s Center for Advanced Vehicular Systems will be presented in Section 3. Finally, this paper will be concluded in Section 4.

Related Datasets

The RUGD dataset [15] aims to address the challenges of semantic understanding in unstructured outdoor environments, particularly for off-road autonomous navigation. It consists of video sequences captured by an onboard camera mounted on a small unmanned ground robot. Unlike other autonomous driving benchmark datasets, RUGD intentionally includes a diverse range of terrain types, irregular class boundaries, minimal structured markings, and challenging visual properties commonly found in natural environments (see Figure 1). These factors make it a valuable resource for researchers working on tasks such as semantic segmentation, recognition, and detection. The dataset contains over 7000 frames, each meticulously annotated with pixel-wise labels. Researchers can use RUGD to develop and evaluate cutting-edge techniques for various applications, including humanitarian assistance and disaster relief, agricultural robotics, environmental surveying in hazardous areas, and humanitarian demining. By providing a comprehensive and realistic representation of unstructured outdoor scenes, RUGD contributes to advancing the field of robotics and computer vision in challenging real-world contexts.
The RELLIS-3D dataset [16] addresses a critical gap by offering data from an off-road environment. It includes annotations for both LiDAR scans and images. The dataset was collected on the Rellis Campus of Texas A&M University, and comprises 13,556 LiDAR scans and 6235 images. Notably, RELLIS-3D introduces unique challenges, such as class imbalance and environmental topography, which differ from urban environments. Current deep learning models for semantic segmentation primarily rely on large-scale datasets from urban scenes. However, off-road environments pose distinct challenges, including unstructured class boundaries, uneven terrain, and irregular features. Figure 2 shows an example image from the RELLIS-3D dataset along with its corresponding semantically labelled ground truth image. Researchers can utilize RELLIS-3D to develop more sophisticated algorithms and explore novel research directions for enhancing autonomous navigation in off-road settings. By combining LiDAR scans, color images, and other sensor data, this dataset provides valuable resources for improving semantic awareness in challenging real-world scenarios.
TartanDrive 2.0 [17], an enhanced version of its predecessor TartanDrive 1.0 [18], expands upon an already impressive collection of off-road terrain data. This updated version boasts seven hours of data captured at speeds reaching up to 15 m per second. Sample imagery is shown in Figure 3. Notably, three new LiDAR sensors have been integrated alongside the original camera, inertial, Global Positioning System (GPS), and proprioceptive sensors. These additional modalities significantly enrich the dataset, enabling researchers to delve into self-supervised learning, multi-modal perception, inverse reinforcement learning, and representation learning. The custom infrastructure provided alongside the dataset empowers end users to customize the data to suit their specific platforms. By releasing both the data and the tools used for its collection, processing, and querying, the creators of TartanDrive 2.0 actively encourage collaborative data aggregation. This resource democratizes access to large-scale datasets, fostering advancements in robotics and autonomous systems.

2. Off-Road Dataset Challenges

In this section, we delve into the challenges and considerations associated with developing an off-road dataset for computer vision-based autonomous navigation. We emphasize the significance of consistent and accurate annotations for training machine learning models that can effectively navigate complex and diverse off-road environments. Additionally, we identify several key difficulties arising from the ambiguity and inconsistency in off-road imagery, particularly regarding vegetation transitions.

2.1. Annotating Off-Road Data: Importance and Challenges

Annotating off-road data is a challenging yet crucial task in the development of autonomous navigation systems. Off-road environments pose unique challenges due to the complexity of their imagery, which is filled with diverse features at the pixel level. This demands meticulous attention to how annotations are applied, unlike urban settings where lane boundaries are well-defined and consistent. Off-road terrains lack these clear demarcations, making it difficult to train machine learning models.
This raises the question: how should we annotate off-road data for computer vision-based autonomous navigation? This is not a simple task, as it involves balancing the trade-offs between the level of detail, consistency, and usefulness of the annotations. On the one hand, we could capture the diversity and complexity of off-road environments by annotating every pixel with a specific class label, such as grass, gravel, bush, tree, etc. On the other hand, we could simplify the annotation process by using more general and abstract categories, such as path, non-path, wooded area, etc. These choices have implications for the performance and reliability of the autonomous systems that use the annotated data for learning and inference. Are we overcomplicating the task of producing a useful dataset for off-road autonomy by forcing such complicated annotations? Are we hindering the progress of off-road autonomy by ignoring the subtleties and nuances of off-road scenes? Do we truly need to differentiate between a patch of weeds in the middle of a gravel path? Or, should we consider the ontology from a broader perspective, such as path, non-path, wooded area, etc.? Understanding these scenes at a higher level could enable the development of more sophisticated autonomy behaviors to address how the UGV handles the perceived region.
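To make this trade-off concrete, the sketch below contrasts a fine-grained ontology with a coarse, behavior-oriented one and shows how a many-to-one mapping collapses detailed labels into broader categories. The class names, IDs, and grouping are hypothetical illustrations, not drawn from any published dataset.

```python
import numpy as np

# Illustrative only: two candidate ontologies for the same off-road scene.
FINE_GRAINED = {
    0: "void", 1: "grass", 2: "gravel", 3: "dirt",
    4: "bush", 5: "tree", 6: "water", 7: "rock",
}

# A coarser ontology groups the same pixels into behavior-relevant regions.
COARSE = {0: "void", 1: "path", 2: "non-path", 3: "wooded"}

# Many-to-one mapping from fine-grained IDs to coarse IDs.
FINE_TO_COARSE = {0: 0, 1: 2, 2: 1, 3: 1, 4: 3, 5: 3, 6: 2, 7: 2}

def coarsen_mask(mask: np.ndarray) -> np.ndarray:
    """Remap a 2D per-pixel mask of fine-grained IDs to coarse IDs."""
    lut = np.zeros(max(FINE_TO_COARSE) + 1, dtype=np.uint8)
    for fine_id, coarse_id in FINE_TO_COARSE.items():
        lut[fine_id] = coarse_id
    return lut[mask]
```

Either ontology can be annotated consistently; the question raised above is which level of abstraction best serves downstream autonomy behaviors.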

2.2. Defining Boundaries in Off-Road Imagery

The transition between one off-road feature and another is often gradual and indistinct, making it challenging to annotate and train models. Determining when a path becomes a berm or when it merges into the surrounding brush is not easy, and distinguishing between a bush and a small tree can be ambiguous. These distinctions are crucial for autonomous systems that need to make real-time navigational decisions.
An important aspect of annotating off-road data is to define consistent and meaningful boundaries between different classes. This is particularly challenging for semantic segmentation models, which aim to assign a class label to every pixel in an image. Semantic segmentation is a useful technique for scene understanding and autonomous navigation, as it provides a detailed and comprehensive representation of the environment. However, the performance of semantic segmentation models depends largely on the quality and quantity of the annotated data that they are trained on.
One of the challenges in annotating off-road data is the high variability and low structure of the scenes. Unlike urban environments with clear boundaries between objects, off-road scenes lack these distinctive features. Instead, they consist of natural elements with complex shapes and textures, such as trees, bushes, grass, and rocks. These elements do not have sharp boundaries but blend into each other gradually and irregularly. For instance, the transition from a dirt path to less defined off-road ground can be subtle and indistinct, depending on factors like soil composition, vegetation coverage, and lighting conditions. Similarly, the boundary between a bush and a tree can be ambiguous because both have branches and leaves that vary in size, shape, and angle. Additionally, off-road scenes may contain small pockets of sky visible through gaps in heavily wooded areas, creating a mix of blue and green pixels that are challenging to separate.
These challenges necessitate careful consideration when annotating off-road data. Defining the relevant classes for off-road autonomy is essential. Annotating pixels that belong to multiple classes or are not clearly assignable to any one class requires a well-established strategy. Addressing the class imbalance arising from the uneven distribution of features in off-road scenes is crucial. Ensuring consistency and accuracy of annotations across different scenes and annotators must be prioritized. Finally, evaluating the quality and usefulness of the annotated data for training and testing semantic segmentation models is imperative.
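As one illustration of handling the class imbalance noted above, the sketch below computes smoothed inverse-frequency class weights from pixel counts, which could then be supplied to a weighted segmentation loss. The weighting scheme and label conventions are common practice offered as an assumption-laden example, not a prescription from this work.

```python
import numpy as np

def class_weights_from_masks(masks, num_classes, eps=1e-6):
    """Per-class weights from a list of integer label masks (IDs in [0, num_classes))."""
    counts = np.zeros(num_classes, dtype=np.float64)
    for mask in masks:
        counts += np.bincount(mask.ravel(), minlength=num_classes)[:num_classes]
    freq = counts / max(counts.sum(), eps)
    # Rare classes (e.g., small pockets of sky) receive larger weights.
    weights = 1.0 / np.log(1.02 + freq)  # smoothed inverse-log frequency
    return weights / weights.mean()      # normalize so the average weight is 1
```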
These questions have significant implications for the development and performance of off-road autonomous systems. The level of detail and precision required for annotations depends on the specific goals and requirements of the autonomy algorithm. Some algorithms may only need broad and abstract categories like “path”, “non-path”, “wooded area”, etc., while others may necessitate more precise and specific labels like “grass”, “gravel”, “bush”, “tree”, etc. The choice of classes and the granularity of annotations can influence the algorithm’s ability to distinguish between different features and navigate effectively. For instance, if the algorithm needs to avoid obstacles and follow a path, it is crucial to differentiate between a patch of weeds and a gravel path, or between a small tree and a large bush. Conversely, if the algorithm is tasked with exploring and mapping an unknown terrain, recognizing the general type and layout of the environment may suffice without requiring pixel-perfect annotations.
The scientific community must address these challenges to advance the field of off-road autonomy. By developing a standardized set of definitions and annotation practices, researchers can create robust and diverse datasets that capture the intricacies and dynamics of off-road environments. These datasets can then be utilized to train and evaluate state-of-the-art semantic segmentation models that are capable of handling the complexities of off-road scenes. Enhancing the scene understanding and semantic awareness of off-road autonomous systems enables more effective and reliable autonomy in highly intricate and unstructured environments.

3. Off-Road Autonomy Dataset: A Preview

We are developing a new off-road dataset to enhance AI models’ situational understanding of their perceived environment, particularly through computer-vision-based methods. This dataset is not only designed for scene understanding, but also holds great potential for the development of autonomous UGVs. Our dataset aims to capture a diverse range of environmental seasons, ensuring a comprehensive collection of data. It will include data from multiple off-road sites, captured across all four major seasons. These collections will feature a wide variety of objects and potential maneuverability obstacles scattered throughout the scenes. Figure 4 and Figure 5 show an example set of RGB and thermal imagery that will be included in the dataset. Beyond the diverse environmental data, our dataset will also include annotated data for multiple computer-vision-based tasks. These tasks include semantic segmentation, object detection and classification, and 3D bounding boxes associated with LiDAR point cloud data. To ensure the comprehensiveness of our dataset, we initially plan to include the following sensing modalities, all accompanied by appropriately labeled ground truth data:
1. RGB Imagery;
2. Longwave Infrared Imagery;
3. LiDAR 3D Point Clouds.
We plan to expand future iterations of the dataset to include additional data types, such as Inertial Measurement Unit (IMU) data, GPS data, and proprioceptive sensor data measurements. However, initial data collections in areas with dense tree cover have shown a significant degradation in GPS data. While we strive to present a robust and useful dataset upon its initial release, we anticipate continuous improvements. These improvements will involve the addition of new data sources, more accurate sensor measurements, and expanded scenarios. Our goal is to contribute a valuable resource to the AI and machine learning community, particularly those involved in off-road autonomous navigation and scene understanding.
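For illustration, a single multi-modal sample from such a collection might be loaded as sketched below. The directory layout, file names, and point-cloud format here are assumptions made for this example and do not represent the dataset’s final published structure.

```python
from pathlib import Path
import numpy as np
from PIL import Image

def load_sample(root, frame_id):
    """Load one co-aligned RGB/thermal/LiDAR/label sample (hypothetical layout)."""
    root = Path(root)
    return {
        "rgb": np.asarray(Image.open(root / "rgb" / f"{frame_id}.png")),
        "thermal": np.asarray(Image.open(root / "thermal" / f"{frame_id}.png")),
        # LiDAR stored as an N x 4 float32 array of (x, y, z, intensity) points.
        "lidar": np.fromfile(root / "lidar" / f"{frame_id}.bin",
                             dtype=np.float32).reshape(-1, 4),
        # Per-pixel semantic labels aligned to the RGB frame.
        "semantic": np.asarray(Image.open(root / "labels" / f"{frame_id}.png")),
    }
```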

3.1. Proposed Path Forward for Dataset Curation Supporting Off-Road Autonomy

In light of the challenges highlighted throughout this article, we propose the following strategies to advance off-road autonomy, which will guide the development of the dataset that we will soon release publicly.

3.1.1. High-Level, Categorical Class Labels

We propose broadening the labels to high-level classes rather than very specific class labels that require pixel-perfect labeling. The complexity of obtaining pixel-perfect annotations suggests that the only feasible approach is through simulation strategies for building training datasets. While modern technologies like UE5 have become increasingly impressive for generating photo-realistic datasets, using strictly synthetic data for training AI models has yet to be proven as a viable method. We agree with the literature that emphasizes the value of using both synthetic and real-world imagery for training AI models [19,20,21,22]. Synthetic data can serve as a valuable supplement to real-world datasets. Therefore, we propose a generalized categorical approach to class labels. For instance, paths, regardless of their material composition, should be labeled as the general class label ‘path’ rather than breaking them down into different types of paths, such as gravel, dirt, grass, etc. To address this, it is important to integrate contextual information or develop multi-stage models that first classify the broader “path” category before optionally refining specific sub-classes during post-processing or further model stages. Additionally, while synthetic data plays a crucial role in supplementing real-world data, domain adaptation techniques, such as style transfer or domain randomization, should be applied to bridge the gap between synthetic and real-world data. These methods can help the model generalize better by exposing it to variations and noise typical of off-road environments.
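A minimal sketch of the multi-stage idea mentioned above is shown below, assuming hypothetical model interfaces and class IDs: a coarse model labels broad regions first, and an optional second stage refines only those pixels already identified as path.

```python
import numpy as np

PATH, NON_PATH, WOODED = 1, 2, 3          # coarse IDs (assumed)
GRAVEL, DIRT, GRASS_PATH = 10, 11, 12     # optional path sub-classes (assumed)

def two_stage_segment(image, coarse_model, path_refiner=None):
    """Coarse segmentation first; optionally refine 'path' pixels into sub-classes."""
    coarse = coarse_model(image)            # (H, W) array of coarse IDs
    if path_refiner is None:
        return coarse
    refined = coarse.copy()
    path_pixels = coarse == PATH
    if path_pixels.any():
        # The second stage only relabels regions already called "path",
        # so its errors cannot flip the path/non-path decision.
        sub = path_refiner(image)            # (H, W) array of sub-class IDs
        refined[path_pixels] = sub[path_pixels]
    return refined
```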
Another key consideration we acknowledge is the potential impact of generalized labels on model interpretability and decision-making in critical scenarios. In off-road environments, small but significant terrain variations, such as the difference between wet grass and dry dirt, can alter the vehicle’s required response. Over-generalizing class labels might reduce the model’s ability to make fine-grained distinctions that are crucial for safe navigation. One way to mitigate this issue is to incorporate sensor fusion techniques, combining LiDAR, radar, and camera data to offer richer contextual insights beyond visual labels alone. Multi-sensor data allows models to infer underlying terrain properties like slipperiness or obstacle density, which are not easily captured through image segmentation alone. Additionally, hierarchical labeling systems could be employed, where high-level categories like “path” are used during initial training phases, but models can progressively refine predictions based on learned environmental context or specific downstream tasks. This layered approach can ensure that while data annotation remains manageable, the AI still retains the capability to differentiate critical subtleties in off-road terrains when needed. Combining these strategies—sensor fusion, hierarchical labeling, and domain adaptation—creates a more robust and adaptable model, capable of overcoming the inherent challenges of imperfect off-road datasets.
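As a simple illustration of the sensor fusion point, the sketch below stacks a LiDAR-derived height channel onto an RGB image before segmentation. The projection of points into pixel coordinates and the channel layout are assumptions made purely for illustration.

```python
import numpy as np

def fuse_rgb_lidar(rgb, projected_points, grid_shape):
    """Return an (H, W, 4) array: normalized RGB plus a per-pixel max-height channel.

    projected_points: iterable of (u, v, z) tuples already projected into the
    image plane (an assumption; a real system needs extrinsic/intrinsic calibration).
    """
    h, w = grid_shape
    height = np.zeros((h, w), dtype=np.float32)
    for u, v, z in projected_points:
        u, v = int(u), int(v)
        if 0 <= v < h and 0 <= u < w:
            height[v, u] = max(height[v, u], z)
    return np.dstack([rgb.astype(np.float32) / 255.0, height])
```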
Figure 6 shows an example semantic segmentation map generated by a model trained on the RELLIS-3D dataset, applied to an image from the dataset we are currently developing. As seen in Figure 6, the class labeled in light blue is one that, we suggest, should not be differentiated from the predicted path shown in orange. We believe it is imperative not to hinder progress in this field because of the immense complexity involved in obtaining pixel-perfect labeling. There must be a level of acceptance that off-road autonomy inherently means training AI models on ‘dirty’ data, meaning that semantic labels will be imperfect. To address this and foster growth in this area, we suggest higher-level categorical class labels.

3.1.2. Embracing Imperfect Class Labels

Considering the challenges in achieving pixel-perfect labels, we propose shifting our focus to training AI models to identify the capabilities of UGVs based on perceived conditions and sensed system-level conditions. For instance, we could train models to recognize tire slippage based on throttle levels and real-time proprioceptive sensor measurements (see the sketch following this paragraph). By making generalizations about the scene, we can enable more sophisticated autonomy behaviors that ensure safe and successful navigation. This could lead to the development of new autonomy stacks that deviate from the conventional paradigm associated with urban/structured environments and adopt a high-level understanding of the scene, with a stronger emphasis on on-board sensing and advanced processing of critical areas. This also presents an open opportunity for investigating fuzzy logic-based methodologies [23,24] for off-road autonomy applications. Figure 7 shows a set of example outputs, corresponding to the images shown in Figure 4, for a model trained for semantic segmentation using the RELLIS dataset. While the model qualitatively appears to do a decent job with its predicted pixel-wise labels, the requirement to differentiate between such nuanced classes leads to predicted labels that complicate an autonomy stack’s ability to utilize the information efficiently.
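As a rough illustration of the tire-slippage idea, the heuristic below compares commanded wheel speed against an odometry- or IMU-derived body speed estimate. The signal names, units, and thresholds are assumptions and would need to be calibrated for a specific platform.

```python
def slip_ratio(wheel_speed_mps, body_speed_mps, eps=0.1):
    """Longitudinal slip ratio: ~0 when rolling freely, near 1 when spinning in place."""
    return max(0.0, (wheel_speed_mps - body_speed_mps) / max(wheel_speed_mps, eps))

def is_slipping(wheel_speed_mps, body_speed_mps, throttle, threshold=0.3):
    """Flag slip only when the throttle is actually commanding forward motion."""
    return throttle > 0.1 and slip_ratio(wheel_speed_mps, body_speed_mps) > threshold
```

Labels produced this way come from the vehicle’s own sensing rather than from pixel-level annotation, which is exactly the kind of generalization about the scene and platform state argued for above.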
Embracing imperfect class labels can actually enhance the robustness of AI models in off-road autonomy by encouraging adaptability and resilience in uncertain environments. In real-world scenarios, perfect annotations are often unrealistic, particularly in unstructured terrains where class boundaries are ambiguous and environmental conditions fluctuate. By training models on imperfect labels, we promote the development of systems that can generalize better to unseen, real-world data. These models become less reliant on pixel-perfect segmentation, and more focused on extracting meaningful, high-level information from noisy or incomplete inputs. Furthermore, imperfect labels mimic the variability and noise that UGVs encounter in off-road environments, allowing the models to learn how to navigate with less-than-perfect data. This can reduce overfitting, making the autonomy stack more flexible and capable of handling novel terrains. Additionally, by accepting imperfect labels, we expedite the data annotation process, significantly reducing the cost and time associated with manually labeling vast amounts of off-road data. Ultimately, embracing imperfection aligns with the realities of off-road autonomy, fostering models that prioritize practicality and robustness over unattainable accuracy in labeling.
Using imperfect labels can actually be a significant benefit for off-road autonomy, as it pushes AI models to develop a higher degree of adaptability and robustness. One potential solution is leveraging weak supervision, where models learn to make sense of patterns in data with incomplete or noisy labels. This aligns well with the unpredictable and unstructured nature of off-road environments, as models trained this way are inherently better at generalizing to new and uncertain scenarios. Additionally, semi-supervised learning becomes advantageous, where a small portion of accurately labeled data is combined with a larger set of imperfect labels, allowing models to learn from a diverse range of examples and better handle real-world variability. Self-supervised learning further reinforces this benefit by enabling models to discover important features within the data autonomously, without the need for pixel-perfect annotations. This is particularly useful in environments where fine-grained details are not always necessary for successful navigation. Active learning can also take advantage of imperfect labels, as the model focuses on the most ambiguous data points, learning to handle uncertainty more efficiently. Together, these approaches make AI systems more resilient and capable of dealing with the inherent messiness of off-road data, turning the challenge of imperfect labels into an opportunity for building more flexible and robust autonomy stacks.
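A minimal sketch of the semi-supervised, pseudo-labeling idea is given below; the estimator interface (scikit-learn-style fit/predict_proba) and the confidence threshold are illustrative assumptions rather than a recommended recipe.

```python
import numpy as np

def pseudo_label_round(model, x_labeled, y_labeled, x_unlabeled, conf_thresh=0.9):
    """One round of pseudo-labeling: train, label confident unlabeled samples, retrain."""
    model.fit(x_labeled, y_labeled)                  # 1. train on the small clean set
    probs = model.predict_proba(x_unlabeled)         # 2. score the unlabeled pool
    confident = probs.max(axis=1) >= conf_thresh     # 3. keep only confident predictions
    x_new = np.concatenate([x_labeled, x_unlabeled[confident]])
    y_new = np.concatenate([y_labeled, probs[confident].argmax(axis=1)])
    model.fit(x_new, y_new)                          # 4. retrain on the expanded set
    return model, int(confident.sum())
```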

3.2. Multi-Stage AI System of Systems and Imitation Learning Strategy

The third major concept that we propose for off-road autonomy, and hence the datasets that are created to support their development, is to consider either a multi-stage AI system of systems or the implementation of imitation learning [25,26,27,28,29,30], much like what is done in robotics as the field strives towards the development of humanoid robots [31,32,33].

3.2.1. Multi-Stage AI System of Systems

In a multi-stage AI-based system of systems [34,35,36], one might develop a high-level AI model to perform semantic segmentation for overall scene awareness and a general understanding based on perception sensors and algorithms. Second- and third-order algorithms would be layered on top of this base understanding of the scene in real time to guide the autonomy behavior of the UGV. These algorithms could attempt to assess the traversability of the perceived path. If no path is identified, strategies could be explored and developed for navigation under increased uncertainty. This could be coupled with a metric or methodology for a system’s risk tolerance. The risk tolerance could be based on various factors, such as mission type, urgency, cost of the system, repairability of the UGV, and so on. These factors would be important considerations under this proposed paradigm, and could be the focus of future research in this field. This strategy could leverage perception sensor information with auxiliary IMU and proprioceptive sensors as the basis for autonomy actions and behaviors.
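The sketch below illustrates how such layered stages might be composed. The perception model, traversability score, and risk-tolerance policy are hypothetical placeholders meant to show the structure of the proposed paradigm, not a reference design.

```python
def plan_behavior(image, segment, traversability, risk_tolerance=0.3):
    """Select a high-level behavior from perception plus a risk budget in [0, 1]."""
    seg_mask = segment(image)             # stage 1: coarse scene understanding
    score = traversability(seg_mask)      # stage 2: 0 (blocked) .. 1 (clear path)

    if score > 0.7:
        return "follow_path"
    if score > 0.7 - risk_tolerance:
        # Marginal terrain: a higher risk tolerance (e.g., urgent mission, cheap,
        # repairable platform) permits proceeding slowly on proprioceptive feedback.
        return "creep_and_probe"
    return "halt_and_replan"              # stage 3: navigate under increased uncertainty
```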

3.2.2. Deep Reinforcement Learning-Based Imitation Learning

Another promising strategy to explore is deep reinforcement learning and task imitation [37,38,39,40,41]. This approach eliminates the need for extensive annotated datasets, which can be time-consuming and challenging to create. Task imitation is particularly appealing, although its primary challenge lies in acquiring large amounts of data generated by actual human drivers. Additionally, a framework needs to be developed to distinguish between desirable and undesirable driver actions. Furthermore, identifying and analyzing the edge cases encountered by drivers, and how they responded to those situations, requires human involvement, with these scenarios subsequently used to enhance the model’s performance. This area of research is actively being explored and has the potential to significantly contribute to the advancement of off-road autonomy. In conclusion, both the multi-stage AI system of systems and the imitation learning strategy offer promising avenues for advancing off-road autonomy. Future research in these areas has the potential to lead to substantial advancements in the field, ultimately resulting in safer and more efficient autonomous navigation in off-road environments.

4. Conclusions

In this article, we consider the intricacies and potential of creating datasets for off-road autonomous navigation. We examine existing datasets for unstructured environments, highlighting their advantages and limitations. We propose a comprehensive framework for an off-road dataset that caters to the diverse requirements of the off-road autonomy community. The initial release of our dataset will comprise over 8500 images for both RGB and thermal imagery, as well as over 8500 LiDAR scans that can be co-aligned for the development of multi-source fusion algorithms. This dataset will be publicly available, and is planned for an initial release on GitHub in March of 2025. We present a path forward to address the challenges associated with annotating off-road data, particularly in defining clear class boundaries and distinguishing between vegetation types. We emphasize the importance of developing standardized annotation practices for off-road datasets, which we believe will result in robust datasets that contribute to the advancement of off-road autonomy and facilitate the development of reliable autonomous systems in challenging environments.

Author Contributions

Conceptualization, S.R.P. (Stanton R. Price), S.R.P. (Steven R. Price) and H.B.L.; methodology, S.R.P. (Stanton R. Price), S.J.P. and H.B.L.; formal analysis, S.R.P. (Stanton R. Price), H.B.L. and S.R.P. (Steven R. Price); investigation, S.R.P. (Stanton R. Price) and S.J.P.; resources, S.R.P. (Stanton R. Price), H.B.L. and S.S.C.; data curation, S.R.P. (Stanton R. Price), H.B.L. and S.S.C.; writing—original draft preparation, S.R.P. (Stanton R. Price); writing—review and editing, S.R.P. (Stanton R. Price), H.B.L., S.S.C., S.R.P. (Steven R. Price), S.J.P. and J.R.F.; visualization, S.R.P. (Stanton R. Price) and H.B.L.; supervision, S.R.P. (Stanton R. Price), S.R.P. (Steven R. Price), S.J.P. and J.R.F.; project administration, S.J.P. and J.R.F.; funding acquisition, S.J.P. and J.R.F. All authors have read and agreed to the published version of the manuscript.

Funding

The research presented herein was funded under “Modeling and Simulation for Manned/Unmanned Teaming”, and managed and executed by the U.S. Army Engineer Research and Development Center.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Gao, Y.; Lin, T.; Borrelli, F.; Tseng, E.; Hrovat, D. Predictive control of autonomous ground vehicles with obstacle avoidance on slippery roads. In Proceedings of the Dynamic Systems and Control Conference, Cambridge, MA, USA, 12–15 September 2010; Volume 44175, pp. 265–272. [Google Scholar]
  2. Febbo, H.; Liu, J.; Jayakumar, P.; Stein, J.L.; Ersal, T. Moving obstacle avoidance for large, high-speed autonomous ground vehicles. In Proceedings of the 2017 American Control Conference (ACC), Seattle, WA, USA, 24–26 May 2017; pp. 5568–5573. [Google Scholar]
  3. Guastella, D.C.; Muscato, G. Learning-based methods of perception and navigation for ground vehicles in unstructured environments: A review. Sensors 2020, 21, 73. [Google Scholar] [CrossRef] [PubMed]
  4. Islam, F.; Nabi, M.; Ball, J.E. Off-road detection analysis for autonomous ground vehicles: A review. Sensors 2022, 22, 8463. [Google Scholar] [CrossRef] [PubMed]
  5. Wang, H.; Liu, B. Path planning and path tracking for collision avoidance of autonomous ground vehicles. IEEE Syst. J. 2021, 16, 3658–3667. [Google Scholar] [CrossRef]
  6. Terapaptommakol, W.; Phaoharuhansa, D.; Koowattanasuchat, P.; Rajruangrabin, J. Design of obstacle avoidance for autonomous vehicle using deep Q-network and CARLA simulator. World Electr. Veh. J. 2022, 13, 239. [Google Scholar] [CrossRef]
  7. Wang, N.; Li, X.; Zhang, K.; Wang, J.; Xie, D. A survey on path planning for autonomous ground vehicles in unstructured environments. Machines 2024, 12, 31. [Google Scholar] [CrossRef]
  8. Tesla. Tesla Vehicle Safety Report. 2024. Available online: https://www.tesla.com/VehicleSafetyReport (accessed on 1 August 2024).
  9. Waymo. Waymo Significantly Outperforms Comparable Human Benchmarks over 7+ Million Miles of Rider-Only Driving. 2023. Available online: https://waymo.com/blog/2023/12/waymo-significantly-outperforms-comparable-human-benchmarks-over-7-million/ (accessed on 1 August 2024).
  10. Grigorescu, S.; Trasnea, B.; Cocias, T.; Macesanu, G. A survey of deep learning techniques for autonomous driving. J. Field Robot. 2020, 37, 362–386. [Google Scholar] [CrossRef]
  11. Ni, J.; Chen, Y.; Chen, Y.; Zhu, J.; Ali, D.; Cao, W. A survey on theories and applications for self-driving cars based on deep learning methods. Appl. Sci. 2020, 10, 2749. [Google Scholar] [CrossRef]
  12. Youssef, F.; Houda, B. Comparative study of end-to-end deep learning methods for self-driving car. Int. J. Intell. Syst. Appl. 2020, 10, 15. [Google Scholar] [CrossRef]
  13. Gupta, A.; Anpalagan, A.; Guan, L.; Khwaja, A.S. Deep learning for object detection and scene perception in self-driving cars: Survey, challenges, and open issues. Array 2021, 10, 100057. [Google Scholar] [CrossRef]
  14. Wijayathunga, L.; Rassau, A.; Chai, D. Challenges and solutions for autonomous ground robot scene understanding and navigation in unstructured outdoor environments: A review. Appl. Sci. 2023, 13, 9877. [Google Scholar] [CrossRef]
  15. Wigness, M.; Eum, S.; Rogers, J.G.; Han, D.; Kwon, H. A RUGD Dataset for Autonomous Navigation and Visual Perception in Unstructured Outdoor Environments. In Proceedings of the International Conference on Intelligent Robots and Systems (IROS), Macau, China, 3–8 November 2019. [Google Scholar]
  16. Jiang, P.; Osteen, P.; Wigness, M.; Saripalli, S. RELLIS-3D Dataset: Data, Benchmarks and Analysis. arXiv 2020, arXiv:2011.12954. [Google Scholar]
  17. Sivaprakasam, M.; Maheshwari, P.; Castro, M.G.; Triest, S.; Nye, M.; Willits, S.; Saba, A.; Wang, W.; Scherer, S. TartanDrive 2.0: More Modalities and Better Infrastructure to Further Self-Supervised Learning Research in Off-Road Driving Tasks. arXiv 2024, arXiv:2402.01913. [Google Scholar]
  18. Triest, S.; Sivaprakasam, M.; Wang, S.J.; Wang, W.; Johnson, A.M.; Scherer, S. Tartandrive: A large-scale dataset for learning off-road dynamics models. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; pp. 2546–2552. [Google Scholar]
  19. Kishore, A.; Choe, T.E.; Kwon, J.; Park, M.; Hao, P.; Mittel, A. Synthetic data generation using imitation training. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 3078–3086. [Google Scholar]
  20. Jaipuria, N.; Zhang, X.; Bhasin, R.; Arafa, M.; Chakravarty, P.; Shrivastava, S.; Manglani, S.; Murali, V.N. Deflating dataset bias using synthetic data augmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA, USA, 14–19 June 2020; pp. 772–773. [Google Scholar]
  21. Nabati, M.; Navidan, H.; Shahbazian, R.; Ghorashi, S.A.; Windridge, D. Using synthetic data to enhance the accuracy of fingerprint-based localization: A deep learning approach. IEEE Sens. Lett. 2020, 4, 6000204. [Google Scholar] [CrossRef]
  22. Meng, Z.; Zhao, S.; Chen, H.; Hu, M.; Tang, Y.; Song, Y. The vehicle testing based on digital twins theory for autonomous vehicles. IEEE J. Radio Freq. Identif. 2022, 6, 710–714. [Google Scholar] [CrossRef]
  23. Price, S.R.; Price, S.R.; Anderson, D.T. Introducing fuzzy layers for deep learning. In Proceedings of the 2019 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), New Orleans, LA, USA, 23–26 June 2019; pp. 1–6. [Google Scholar]
  24. Talpur, N.; Abdulkadir, S.J.; Alhussian, H.; Hasan, M.H.; Aziz, N.; Bamhdi, A. A comprehensive review of deep neuro-fuzzy system architectures and their optimization methods. Neural Comput. Appl. 2022, 34, 1837–1875. [Google Scholar] [CrossRef]
  25. Pan, Y.; Cheng, C.A.; Saigol, K.; Lee, K.; Yan, X.; Theodorou, E.A.; Boots, B. Imitation learning for agile autonomous driving. Int. J. Robot. Res. 2020, 39, 286–302. [Google Scholar] [CrossRef]
  26. Chen, J.; Yuan, B.; Tomizuka, M. Deep imitation learning for autonomous driving in generic urban scenarios with enhanced safety. In Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 3–8 November 2019; pp. 2884–2890. [Google Scholar]
  27. Le Mero, L.; Yi, D.; Dianati, M.; Mouzakitis, A. A survey on imitation learning techniques for end-to-end autonomous vehicles. IEEE Trans. Intell. Transp. Syst. 2022, 23, 14128–14147. [Google Scholar] [CrossRef]
  28. Zhu, M.; Wang, Y.; Pu, Z.; Hu, J.; Wang, X.; Ke, R. Safe, efficient, and comfortable velocity control based on reinforcement learning for autonomous driving. Transp. Res. Part C Emerg. Technol. 2020, 117, 102662. [Google Scholar] [CrossRef]
  29. Huang, Z.; Wu, J.; Lv, C. Efficient deep reinforcement learning with imitative expert priors for autonomous driving. IEEE Trans. Neural Networks Learn. Syst. 2022, 34, 7391–7403. [Google Scholar] [CrossRef]
  30. Kiran, B.R.; Sobh, I.; Talpaert, V.; Mannion, P.; Al Sallab, A.A.; Yogamani, S.; Pérez, P. Deep reinforcement learning for autonomous driving: A survey. IEEE Trans. Intell. Transp. Syst. 2021, 23, 4909–4926. [Google Scholar] [CrossRef]
  31. Hua, J.; Zeng, L.; Li, G.; Ju, Z. Learning for a robot: Deep reinforcement learning, imitation learning, transfer learning. Sensors 2021, 21, 1278. [Google Scholar] [CrossRef] [PubMed]
  32. Schaal, S. Is imitation learning the route to humanoid robots? Trends Cogn. Sci. 1999, 3, 233–242. [Google Scholar] [CrossRef] [PubMed]
  33. Johns, E. Coarse-to-fine imitation learning: Robot manipulation from a single demonstration. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 4613–4619. [Google Scholar]
  34. Wang, L.; Liu, Z. Data-driven product design evaluation method based on multi-stage artificial neural network. Appl. Soft Comput. 2021, 103, 107117. [Google Scholar] [CrossRef]
  35. Injadat, M.; Moubayed, A.; Nassif, A.B.; Shami, A. Multi-stage optimized machine learning framework for network intrusion detection. IEEE Trans. Netw. Serv. Manag. 2020, 18, 1803–1816. [Google Scholar] [CrossRef]
  36. Vemulapalli, R.; Pouransari, H.; Faghri, F.; Mehta, S.; Farajtabar, M.; Rastegari, M.; Tuzel, O. Knowledge Transfer from Vision Foundation Models for Efficient Training of Small Task-specific Models. In Proceedings of the ICML, Vienna, Austria, 21 July 2024. [Google Scholar]
  37. Ross, S.; Bagnell, J.A. Reinforcement and imitation learning via interactive no-regret learning. arXiv 2014, arXiv:1406.5979. [Google Scholar]
  38. Reddy, S.; Dragan, A.D.; Levine, S. Sqil: Imitation learning via reinforcement learning with sparse rewards. arXiv 2019, arXiv:1905.11108. [Google Scholar]
  39. Zhu, Y.; Wang, Z.; Merel, J.; Rusu, A.; Erez, T.; Cabi, S.; Tunyasuvunakool, S.; Kramár, J.; Hadsell, R.; de Freitas, N.; et al. Reinforcement and imitation learning for diverse visuomotor skills. arXiv 2018, arXiv:1802.09564. [Google Scholar]
  40. Le, H.; Jiang, N.; Agarwal, A.; Dudík, M.; Yue, Y.; Daumé III, H. Hierarchical imitation and reinforcement learning. In Proceedings of the International Conference on Machine Learning. PMLR, Stockholm, Sweden, 10–15 July 2018; pp. 2917–2926. [Google Scholar]
  41. Sallab, A.E.; Abdou, M.; Perot, E.; Yogamani, S. Deep reinforcement learning framework for autonomous driving. arXiv 2017, arXiv:1704.02532. [Google Scholar] [CrossRef]
Figure 1. Sample RUGD imagery and the corresponding semantic segmentation annotations [15].
Figure 2. Sample RELLIS-3D imagery and its corresponding semantic segmentation annotation [16].
Figure 3. Sample TartanDrive 2.0 imagery [18].
Figure 4. Subset of RGB imagery to be included in our upcoming open-source off-road dataset.
Figure 5. Subset of thermal imagery to be included in our upcoming open-source off-road dataset.
Figure 6. Example output (right) from a model trained on the RELLIS dataset, applied to an image (left) from our custom dataset. Here, we emphasize the need to broaden the labels to high-level classes rather than differentiating between, for example, grass and dirt. Focusing on the transition in the image from gravel (labeled in orange) to grass (labeled in light blue), we question whether this level of granularity is having a net-negative effect on the progress of off-road autonomy.
Figure 7. Semantic segmentation outputs for a model trained on the RELLIS dataset, but applied to the imagery shown in Figure 4.

