Comparative Analysis of Patch Attack on VLM-Based Autonomous Driving Architectures

Mar 9, 2026 · David Fernandez, Pedram MohajerAnsari, Amir Salarpour, Long Cheng, Abolfazl Razi, Mert D. Pesé
Abstract
Vision-language models are emerging for autonomous driving, yet their robustness to physical adversarial attacks remains unexplored. This paper presents a systematic framework for comparative adversarial evaluation across three VLM architectures: Dolphins, OmniDrive (Omni-L), and LeapVAD. Using black-box optimization with semantic homogenization for fair comparison, we evaluate physically realizable patch attacks in CARLA simulation. Results reveal severe vulnerabilities across all architectures, sustained multi-frame failures, and critical object detection degradation. Our analysis exposes distinct architectural vulnerability patterns, demonstrating that current VLM designs inadequately address adversarial threats in safety-critical autonomous driving applications.
Type
Publication
2026 IEEE Intelligent Vehicles Symposium (IV)
Authors
David Fernandez
PhD Candidate in Computer Science

David Fernandez is a PhD candidate in Computer Science at Clemson University, working on safe, efficient, and explainable AI for safety-critical systems. His research spans perception, adversarial robustness, and on-device deployment of large foundation models, including LLMs and VLMs. He has authored five first-author publications on component-level explainability, zero-shot reasoning, and adversarial scenario analysis, alongside collaborative work on edge AI for industrial agentic systems. Much of this research is grounded in autonomous driving, where trustworthiness, latency, and robustness constraints are unforgiving, but the underlying methods transfer broadly to other high-stakes domains.

As a member of Clemson’s VIPR-GS Research Program, he develops hierarchical LLM reasoning frameworks and VLM evaluation systems for the U.S. Army’s Next Generation Combat Vehicle (NGCV) program, focusing on zero-shot reasoning and component-level explainability under real-world deployment constraints.

At BMW Group, he designs agentic AI systems for enterprise environments. His work there includes autonomous prompt-optimization pipelines that enable continual agent improvement without model retraining, as well as context-aware moderation frameworks that detect coordinated multi-turn adversarial attacks in production deployments.