This is an early access version, the complete PDF, HTML, and XML versions will be available soon.
Article

MultiDistiller: Efficient Multimodal 3D Detection via Knowledge Distillation for Drones and Autonomous Vehicles

1
School of Electronic Information, Wuhan University, Wuhan 430079, China
2
Shanxi Road and Bridge Group Xinzhou National Highway Project Construction Management Co., Ltd., Xinzhou 034000, China
*
Author to whom correspondence should be addressed.
Drones 2025, 9(5), 322; https://doi.org/10.3390/drones9050322
Submission received: 26 March 2025 / Revised: 20 April 2025 / Accepted: 21 April 2025 / Published: 22 April 2025
(This article belongs to the Special Issue Cooperative Perception for Modern Transportation)

Abstract

Real-time 3D object detection is a cornerstone of safe operation for drones and autonomous vehicles (AVs): drones must avoid millimeter-scale power lines in cluttered airspace, while AVs must instantly recognize pedestrians and vehicles in dynamic urban environments. Although detection methods based on point clouds, cameras, and multimodal fusion have progressed significantly, existing high-precision models are too computationally complex to meet the real-time requirements of vehicular edge devices. In addition, making a model lightweight often causes multimodal feature coupling to fail and unbalances classification and localization performance. To address these challenges, this paper proposes a knowledge distillation framework for multimodal 3D object detection that combines attention guidance, ranking-aware learning, and interactive feature supervision to achieve efficient model compression and performance optimization. Specifically, (1) to strengthen the student model’s focus on key channel and spatial features, we introduce attention-guided feature distillation, which leverages a bird’s-eye-view foreground mask and a dual-attention mechanism; (2) to mitigate the degradation of classification performance when moving from two-stage to single-stage detectors, we propose ranking-aware category distillation, which models the anchor-level distribution; and (3) to address insufficient cross-modal feature extraction, we enhance the student network’s image features with the teacher network’s point cloud spatial priors, thereby constructing a LiDAR-image cross-modal feature alignment mechanism. Experimental results demonstrate the effectiveness of the proposed approach in multimodal 3D object detection. On the KITTI dataset, our method improves network performance by 4.89% even after the number of channels is halved.
Keywords: drones; autonomous vehicles; 3D object detection; multimodal fusion; LiDAR; knowledge distillation; intelligent perception
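
To make the idea of ranking-aware category distillation more concrete, the following is a minimal sketch and not the paper's actual formulation, which the abstract does not specify. It assumes that the "anchor-level distribution" can be approximated by a softmax taken across the anchors' classification scores, and that the student is trained to match the teacher's distribution through a KL divergence. The function names, the `temperature` parameter, and the example scores are all hypothetical.

```python
import math


def softmax(scores, temperature=1.0):
    """Softmax across anchors (numerically stabilized by subtracting the max)."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def rank_distillation_loss(teacher_scores, student_scores, temperature=1.0):
    """KL(teacher || student) between anchor-level score distributions.

    Assumption: each list holds one classification score per anchor for a
    single class. The loss is small when the student orders and spaces the
    anchors the way the teacher does, regardless of a constant score offset.
    """
    if len(teacher_scores) != len(student_scores):
        raise ValueError("teacher and student must score the same anchors")
    p = softmax(teacher_scores, temperature)
    q = softmax(student_scores, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# Hypothetical scores for three anchors.
teacher = [2.0, 1.0, 0.1]
same_ranking = [3.0, 2.0, 1.1]      # shifted by a constant, same ordering
reversed_ranking = [0.1, 1.0, 2.0]  # ordering flipped

loss_same = rank_distillation_loss(teacher, same_ranking)
loss_reversed = rank_distillation_loss(teacher, reversed_ranking)
```

Because the softmax is invariant to a constant shift, the student in this sketch is penalized for ranking anchors differently from the teacher rather than for producing different absolute scores, which is one plausible reading of why a ranking-based target could help a single-stage student learn from a two-stage teacher.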

Share and Cite

MDPI and ACS Style

Yang, B.; Tao, T.; Wu, W.; Zhang, Y.; Meng, X.; Yang, J. MultiDistiller: Efficient Multimodal 3D Detection via Knowledge Distillation for Drones and Autonomous Vehicles. Drones 2025, 9, 322. https://doi.org/10.3390/drones9050322

AMA Style

Yang B, Tao T, Wu W, Zhang Y, Meng X, Yang J. MultiDistiller: Efficient Multimodal 3D Detection via Knowledge Distillation for Drones and Autonomous Vehicles. Drones. 2025; 9(5):322. https://doi.org/10.3390/drones9050322

Chicago/Turabian Style

Yang, Binghui, Tao Tao, Wenfei Wu, Yongjun Zhang, Xiuyuan Meng, and Jianfeng Yang. 2025. "MultiDistiller: Efficient Multimodal 3D Detection via Knowledge Distillation for Drones and Autonomous Vehicles" Drones 9, no. 5: 322. https://doi.org/10.3390/drones9050322

APA Style

Yang, B., Tao, T., Wu, W., Zhang, Y., Meng, X., & Yang, J. (2025). MultiDistiller: Efficient Multimodal 3D Detection via Knowledge Distillation for Drones and Autonomous Vehicles. Drones, 9(5), 322. https://doi.org/10.3390/drones9050322
