Understanding Adversarial Transferability in Vision-Language Models for Autonomous Driving: A Cross-Architecture Analysis
Key Contributions & Takeaways
- Adversarial patches transfer across VLM architectures with 73–91% success rate; attackers need no knowledge of the deployed model to mount effective attacks.
- Introduces a Transfer Matrix framework revealing that CLIP-based vision encoders (Dolphins, LeapVAD) drive stronger bidirectional transferability than EVA-CLIP (OmniDrive).
- Attacks persist across 64–79% of frames throughout the critical decision window, too sustained for temporal filtering or ensemble defenses to reliably mitigate.
Physical adversarial patches on road signs can manipulate VLM driving decisions. Attackers typically don't know which model a vehicle uses, yet that may not matter.
Three VLMs (Dolphins, OmniDrive, LeapVAD) are attacked with physically realizable patches in Crosswalk and Highway scenarios; the patches are optimized with black-box NES (natural evolution strategies), which needs only query access to the model.
Patches optimised for one model remain 73–91% effective on others. Architectural diversity alone provides limited real-world protection.
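To make the black-box NES step mentioned above concrete, the following is a minimal sketch of an NES gradient estimator with antithetic sampling, applied to a toy patch. It is not the authors' pipeline: `toy_loss`, `TARGET`, the patch shape, and all hyperparameters (`sigma`, `n_pairs`, `lr`, `steps`) are illustrative assumptions. In a real attack, the loss would come from querying the driving VLM on a rendered scene rather than from a closed-form function.

```python
import numpy as np

def nes_gradient(loss_fn, x, sigma=0.1, n_pairs=50, rng=None):
    """Estimate the gradient of loss_fn at x using only function queries.

    Uses antithetic Gaussian perturbations (x + sigma*u, x - sigma*u).
    """
    rng = np.random.default_rng() if rng is None else rng
    grad = np.zeros_like(x, dtype=float)
    for _ in range(n_pairs):
        u = rng.standard_normal(x.shape)
        # Finite difference along the random direction u.
        delta = loss_fn(x + sigma * u) - loss_fn(x - sigma * u)
        grad += delta * u
    return grad / (2.0 * sigma * n_pairs)

def optimize_patch(loss_fn, patch, steps=200, lr=0.05, seed=0):
    """Gradient descent on the NES estimate, keeping pixels in [0, 1]."""
    rng = np.random.default_rng(seed)
    history = [loss_fn(patch)]
    for _ in range(steps):
        grad = nes_gradient(loss_fn, patch, rng=rng)
        patch = np.clip(patch - lr * grad, 0.0, 1.0)
        history.append(loss_fn(patch))
    return patch, history

# Hypothetical stand-in for the attack objective: distance to a fixed
# target pattern. A real objective would score the victim model's output.
TARGET = np.full((2, 2, 3), 0.7)

def toy_loss(patch):
    return float(np.sum((patch - TARGET) ** 2))

if __name__ == "__main__":
    start = np.zeros((2, 2, 3))
    final_patch, history = optimize_patch(toy_loss, start)
    print(f"loss: {history[0]:.3f} -> {history[-1]:.6f}")
```

The estimator never touches model internals, which is why this family of methods suits the threat model described here: the attacker only observes outputs.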
Transfer rates are computed as TR_ij = ASR_ij / ASR_ii, normalising the cross-architecture attack success rate (ASR) against the same-model baseline. Diagonal entries (self-attacks) are shown in gray. All experiments were conducted in the CARLA simulator with physically realizable patches. Scenarios: Crosswalk/Bus Stop and Highway/Billboard.
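The normalisation above can be sketched in a few lines of NumPy. This assumes row i is the model the patch was optimised on and column j is the model it is tested on; that indexing convention, the function name `transfer_matrix`, and the numbers in the example are illustrative assumptions, not values from the study.

```python
import numpy as np

def transfer_matrix(asr):
    """Compute TR[i, j] = ASR[i, j] / ASR[i, i] for a square ASR matrix.

    asr[i, j] is the attack success rate of a patch optimised on model i
    and evaluated on model j (assumed convention).
    """
    asr = np.asarray(asr, dtype=float)
    if asr.ndim != 2 or asr.shape[0] != asr.shape[1]:
        raise ValueError("ASR matrix must be square")
    self_asr = np.diag(asr)
    if np.any(self_asr <= 0):
        raise ValueError("self-attack success rates must be positive")
    # Divide each row by its own diagonal entry (the self-attack baseline).
    return asr / self_asr[:, None]

if __name__ == "__main__":
    # Hypothetical two-model example, not data from the paper.
    example_asr = [[0.80, 0.60],
                   [0.45, 0.90]]
    print(np.round(transfer_matrix(example_asr), 2))
```

By construction the diagonal is 1, so the off-diagonal entries read directly as the fraction of same-model attack strength that survives on another architecture.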
"LLaVA-7B and MoE-LLaVA identified potential crash scenarios 1.13 to 1.33 seconds earlier than human drivers, highlighting their potential role in autonomous driving systems."

David Fernandez is a PhD candidate in Computer Science at Clemson University, working on safe, efficient, and explainable AI for safety-critical systems. His research spans perception, adversarial robustness, and on-device deployment of large foundation models, including LLMs and VLMs, with five first-authored publications on component-level explainability, zero-shot reasoning, and adversarial scenario analysis, alongside collaborative work on edge AI for industrial agentic systems. Much of this research is grounded in autonomous driving, where trustworthiness, latency, and robustness constraints are unforgiving, but the underlying methods transfer broadly to other high-stakes domains.
As a member of Clemson’s VIPR-GS Research Program, he develops hierarchical LLM reasoning frameworks and VLM evaluation systems for the U.S. Army’s Next Generation Combat Vehicle (NGCV) program, focusing on zero-shot reasoning and component-level explainability under real-world deployment constraints.
At BMW Group, he designs agentic AI systems for enterprise environments, building autonomous prompt optimization pipelines that enable continual agent improvement without model retraining and context-aware moderation frameworks that detect coordinated multi-turn adversarial attacks in production deployments.

