worldbench/awesome-vla-for-ad
A survey paper and curated list of vision-language-action models for autonomous driving, organized into end-to-end and dual-system paradigms.

This repository compiles and reviews vision-action (VA) and vision-language-action (VLA) models for autonomous driving. It traces the evolution from early VA approaches to modern VLA frameworks and sorts methods into two paradigms: end-to-end VLA, which integrates perception, reasoning, and planning in a single model, and dual-system VLA, which separates slow VLM deliberation from fast planner execution. The project includes an associated arXiv paper and a HuggingFace leaderboard for evaluating VLA models.
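
To make the distinction between the two paradigms concrete, the following is a minimal sketch, not code from the repository or from any surveyed method. All names here (`DualSystemDriver`, `slow_vlm`, `fast_planner`, `slow_period`, `end_to_end_step`) are hypothetical, and the fixed refresh interval is an assumption chosen only for illustration. The slow component stands in for VLM deliberation and runs occasionally; the fast component stands in for the planner and runs on every tick using the most recent guidance.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Waypoint = Tuple[float, float]  # (x, y) in an assumed ego-centric frame


def end_to_end_step(model: Callable[[str], List[Waypoint]],
                    observation: str) -> List[Waypoint]:
    """End-to-end style: one model maps the observation straight to a plan."""
    return model(observation)


@dataclass
class DualSystemDriver:
    """Dual-system style: slow guidance is refreshed rarely, planning runs every tick."""
    slow_vlm: Callable[[str], str]                       # hypothetical stand-in for VLM deliberation
    fast_planner: Callable[[str, str], List[Waypoint]]   # hypothetical stand-in for the fast planner
    slow_period: int = 5                                 # assumed refresh interval, in ticks
    _guidance: str = ""
    _tick: int = 0

    def step(self, observation: str) -> List[Waypoint]:
        # Refresh the high-level guidance only every `slow_period` ticks.
        if self._tick % self.slow_period == 0:
            self._guidance = self.slow_vlm(observation)
        self._tick += 1
        # The fast planner always runs, conditioned on the latest (possibly stale) guidance.
        return self.fast_planner(observation, self._guidance)
```

In this sketch the planner may act on guidance that is several ticks old, which is the latency trade-off the dual-system split implies: the expensive deliberation is kept off the per-tick control path.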