EvoDriveVLA: Evolving Autonomous Driving VLA Models via Collaborative Perception-Planning Distillation
Jiajun Cao1,2†, Xiaoan Zhang1,2†, Xiaobao Wei1†, Liyuqiu Huang1,2, Wang Zijian2, Hanzhen Zhang2, Zhengyu Jia2, Wei Mao2, Xianming Liu2, Shuchang Zhou2, Yang Wang2*, Shanghang Zhang1*
1Peking University, 2XPENG
† Equal contribution
* Corresponding authors
Vision-Language-Action (VLA) models have shown great promise for autonomous driving, yet they suffer from degraded perception once the visual encoder is unfrozen and from accumulated instability in long-term planning. To address these challenges, we propose EvoDriveVLA, a novel collaborative perception-planning distillation framework that integrates self-anchored perceptual constraints with oracle-guided trajectory optimization. Specifically, self-anchored visual distillation uses a self-anchor teacher to impose visual anchoring constraints, regularizing the student's representations via trajectory-guided key-region awareness. In parallel, oracle-guided trajectory distillation employs a future-aware oracle teacher with coarse-to-fine trajectory refinement and Monte Carlo dropout sampling to produce high-quality trajectory candidates, from which the optimal trajectory is selected to guide the student's prediction.
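The candidate-generation step described above can be illustrated with a minimal sketch. This is not the EvoDriveVLA implementation: the linear waypoint head, the function names, and the selection criterion (lowest average displacement error against a reference future trajectory) are all assumptions made for illustration, and the coarse-to-fine refinement stage is omitted.

```python
import math
import random


def dropout_forward(features, weights, drop_prob, rng):
    """One stochastic forward pass with dropout kept active (Monte Carlo dropout)."""
    keep = 1.0 - drop_prob
    # Inverted dropout: zero each feature with probability drop_prob, rescale the rest.
    masked = [f / keep if rng.random() < keep else 0.0 for f in features]
    # Hypothetical linear head: each waypoint (x, y) is a weighted sum of the features.
    return [
        (sum(wx * m for wx, m in zip(row_x, masked)),
         sum(wy * m for wy, m in zip(row_y, masked)))
        for row_x, row_y in weights
    ]


def average_displacement_error(trajectory, reference):
    """Mean Euclidean distance between corresponding waypoints."""
    return sum(math.dist(p, q) for p, q in zip(trajectory, reference)) / len(reference)


def select_best_candidate(features, weights, reference,
                          num_samples=8, drop_prob=0.2, seed=0):
    """Sample several dropout passes and keep the candidate closest to the reference.

    Assumption: the "optimal" candidate is the one with the lowest average
    displacement error; the source does not state the actual criterion.
    """
    rng = random.Random(seed)
    candidates = [dropout_forward(features, weights, drop_prob, rng)
                  for _ in range(num_samples)]
    errors = [average_displacement_error(c, reference) for c in candidates]
    best = min(range(num_samples), key=errors.__getitem__)
    return candidates[best], errors[best], errors
```

In a distillation setup, the selected candidate would then serve as the target that the student's predicted trajectory is pulled toward; here the reference trajectory stands in for the future information available to the oracle teacher.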
Citing
If you find EvoDriveVLA useful in your research or applications, please consider giving us a star and citing it with the following BibTeX entry:
Acknowledgement
Our work is primarily based on the following codebases: Impromptu-VLA, FSDrive, and OmniDrive.