Avoiding the Crash: A Vision-Language Model Evaluation of Critical Traffic Scenarios
Key Contributions & Takeaways
- Demonstrates that state-of-the-art VLMs (LLaVA-7B and MoE-LLaVA) identify potential crashes 1.13 to 1.33 seconds earlier than human drivers.
- Introduces Crash Prevention Efficiency (CPE), a novel metric evaluating VLM timing and proximity performance in crash sequences.
- Provides a comprehensive evaluation framework for real-time decision support in autonomous driving systems (ADS), based on frame-by-frame diagnostics of critical scenarios.
Autonomous vehicle (AV) systems rely on deep neural networks (DNNs) that require constant retraining. When these models are outdated, they can fail to recognize novel crash scenarios, leading to preventable accidents.
Real-world dashcam crash footage is decomposed frame-by-frame. LLaVA-7B and MoE-LLaVA predict the safest driving action at each frame: brake, accelerate, or turn.
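The frame-by-frame procedure can be pictured with a minimal sketch. It assumes the frames are already extracted at a known frame rate and that `ask_model` is a hypothetical stand-in for a call to LLaVA-7B or MoE-LLaVA that returns free text. The keyword parsing below is an illustrative assumption, not the prompt or parsing used in the study.

```python
from typing import Callable, List, Sequence, Tuple

# The three actions named in the text.
ACTIONS = ("brake", "accelerate", "turn")


def parse_action(reply: str) -> str:
    """Map a free-text reply to one action; 'unknown' if zero or several match.

    Simple substring matching is an assumption made for this sketch.
    """
    text = reply.lower()
    hits = [action for action in ACTIONS if action in text]
    return hits[0] if len(hits) == 1 else "unknown"


def label_frames(
    frames: Sequence[object],
    fps: float,
    ask_model: Callable[[object], str],  # hypothetical VLM wrapper
) -> List[Tuple[float, str]]:
    """Return (timestamp in seconds, predicted action) for every frame."""
    results = []
    for index, frame in enumerate(frames):
        timestamp = index / fps
        results.append((timestamp, parse_action(ask_model(frame))))
    return results


if __name__ == "__main__":
    # Stub model for demonstration only: it "sees" a hazard from frame 2 on.
    def stub_model(frame: object) -> str:
        return "Brake now." if frame >= 2 else "Accelerate gently."

    for timestamp, action in label_frames([0, 1, 2, 3], fps=2.0, ask_model=stub_model):
        print(f"{timestamp:.1f}s -> {action}")
```

Keeping the model call behind a plain callable means the same loop can be run against either model, or against a stub when checking the surrounding logic.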
Crash Prevention Efficiency scores how early and how precisely a model detects the threat, measured against the point of no return (t_PNR) and the crash point (t_x).
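The timing side of this idea can be illustrated with a small sketch. It is not the CPE formula, which this summary does not state. It only computes how far ahead of the point of no return and of the crash point a first detection falls. The function names, and the choice to treat the first flagged frame as the detection time, are assumptions made for illustration.

```python
from typing import Dict, Optional, Sequence, Union


def first_detection(times: Sequence[float], flagged: Sequence[bool]) -> Optional[float]:
    """Return the timestamp of the first frame flagged as a threat, or None."""
    for timestamp, is_flagged in zip(times, flagged):
        if is_flagged:
            return timestamp
    return None


def timing_margins(
    t_detect: float, t_pnr: float, t_x: float
) -> Dict[str, Union[float, bool]]:
    """Seconds between detection and the two reference points.

    Positive margins mean the detection came before the reference point.
    This is an illustrative quantity, not the paper's CPE score.
    """
    return {
        "margin_to_pnr": t_pnr - t_detect,
        "margin_to_crash": t_x - t_detect,
        "before_pnr": t_detect <= t_pnr,
    }


if __name__ == "__main__":
    # Made-up timestamps and flags, for demonstration only.
    times = [0.0, 0.5, 1.0, 1.5, 2.0]
    flagged = [False, False, True, True, True]
    t_detect = first_detection(times, flagged)
    if t_detect is not None:
        print(timing_margins(t_detect, t_pnr=1.5, t_x=2.0))
```

A detection that lands after the point of no return yields a negative `margin_to_pnr`, which is the case an early-warning metric would penalize.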
"LLaVA-7B and MoE-LLaVA identified potential crash scenarios 1.13 to 1.33 seconds earlier than human drivers, highlighting their potential role in autonomous driving systems."

David Fernandez is a PhD candidate in Computer Science at Clemson University, working on safe, efficient, and explainable AI for safety-critical systems. His research spans perception, adversarial robustness, and on-device deployment of large foundation models, including LLMs and VLMs, with five first-authored publications on component-level explainability, zero-shot reasoning, and adversarial scenario analysis, alongside collaborative work on edge AI for industrial agentic systems. Much of this research is grounded in autonomous driving, where trustworthiness, latency, and robustness constraints are unforgiving, but the underlying methods transfer broadly to other high-stakes domains.
As a member of Clemson’s VIPR-GS Research Program, he develops hierarchical LLM reasoning frameworks and VLM evaluation systems for the U.S. Army’s Next Generation Combat Vehicle (NGCV) program, focusing on zero-shot reasoning and component-level explainability under real-world deployment constraints.
At BMW Group, he designs agentic AI systems for enterprise environments, building autonomous prompt optimization pipelines that enable continual agent improvement without model retraining and context-aware moderation frameworks that detect coordinated multi-turn adversarial attacks in production deployments.







