AI Generates Driving Scenarios from Real-World Failure Records

Anjali Parashar, Chuchu Fan · July 1, 2026

Summary

This research proposes an LLM-based pipeline to generate diverse and accurate testing scenarios for Autonomous Driving Systems (ADS) by leveraging categorical and contextual information from natural language historical failure records. The method successfully discovers critical failures within a limited testing budget.

Researchers have developed a pipeline that generates testing scenarios for Autonomous Driving Systems (ADS) from real-world failure records. Current simulation-based testing methods often rely on fixed scenario representations or require extensive manual effort to design test templates. The new approach instead uses Large Language Models (LLMs) to synthesize diverse and accurate scenarios from natural language descriptions of historical crashes, such as those found in NHTSA records. The pipeline is modular, and the synthetic scenarios it produces are compatible with specified testing constraints. The authors applied it to create a varied set of scenarios for autonomous navigation testing on the MetaDrive simulator. The generated scenarios combined road types, non-ego vehicle movements, and on-road anomalies such as work zones, and they aligned well with the specified testing conditions. Crucially, the method uncovered system failures within a budget of only 20 scenarios, which suggests it is an efficient way to expose latent vulnerabilities.
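To make "compatible with specified testing constraints" concrete, here is a minimal sketch of how a scenario and a simulator's limits might be represented and checked. It is not the authors' representation: the class names `Scenario` and `SimConstraints`, the `violations` helper, and every category value are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Scenario:
    """One candidate test scenario (all field names are illustrative)."""
    road_type: str          # e.g. "intersection"
    non_ego_maneuver: str   # e.g. "cut_in"
    anomaly: str            # e.g. "work_zone", or "none"


@dataclass(frozen=True)
class SimConstraints:
    """What a given simulator setup can represent (hypothetical values)."""
    road_types: FrozenSet[str]
    maneuvers: FrozenSet[str]
    anomalies: FrozenSet[str]


def violations(scenario: Scenario, limits: SimConstraints) -> List[str]:
    """Return one readable reason for each field that falls outside the limits."""
    checks = [
        ("road_type", scenario.road_type, limits.road_types),
        ("non_ego_maneuver", scenario.non_ego_maneuver, limits.maneuvers),
        ("anomaly", scenario.anomaly, limits.anomalies),
    ]
    return [
        f"{name}={value!r} is not supported"
        for name, value, allowed in checks
        if value not in allowed
    ]


# Assumed limits for one simulator configuration.
limits = SimConstraints(
    road_types=frozenset({"straight", "curve", "intersection"}),
    maneuvers=frozenset({"cut_in", "hard_brake", "lane_change"}),
    anomalies=frozenset({"none", "work_zone"}),
)

ok = Scenario("intersection", "cut_in", "work_zone")
bad = Scenario("roundabout", "cut_in", "flooded_road")

print(violations(ok, limits))   # empty list: the scenario fits the limits
print(violations(bad, limits))  # two reasons: road type and anomaly
```

Keeping the scenario as a small set of categorical fields makes it cheap to reject anything the simulator cannot render before any simulation time is spent.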

Why it matters

This approach can make pre-deployment testing of autonomous driving systems more efficient: by grounding test scenarios in real-world failure records, developers can find and mitigate safety-critical failures before deployment while running only a small number of simulations.

How to implement this in your domain

1. Explore integrating LLM-based scenario generation into your autonomous system testing pipeline.
2. Use historical failure data (e.g., incident reports, crash records) as input for generating diverse test cases.
3. Develop or adapt LLM prompts so that generated scenarios align with your specific testing constraints and environments.
4. Pilot the method in a simulation environment to identify edge cases and vulnerabilities in your autonomous systems.
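The steps above can be sketched as a small loop: build a constrained prompt from each failure record, ask a model for a structured scenario, and keep only replies that respect the constraints. This is a rough illustration under stated assumptions, not the paper's pipeline. The `ALLOWED` vocabulary, the prompt wording, the function names, and the sample records are all invented here, and `fake_llm` is a stand-in for whatever model client you actually use.

```python
import json
from typing import Callable, Dict, FrozenSet, List

# Hypothetical vocabulary describing what the target simulator can represent.
ALLOWED: Dict[str, FrozenSet[str]] = {
    "road_type": frozenset({"straight", "curve", "intersection"}),
    "non_ego_maneuver": frozenset({"cut_in", "hard_brake", "lane_change"}),
    "anomaly": frozenset({"none", "work_zone"}),
}


def build_prompt(record: str) -> str:
    """Embed the testing constraints and one failure record in a prompt."""
    options = "\n".join(
        f"- {field}: one of {sorted(values)}" for field, values in ALLOWED.items()
    )
    return (
        "Read the crash record and describe a test scenario as a JSON object "
        "with exactly these keys:\n"
        f"{options}\n\nCrash record:\n{record}\n"
    )


def parse_scenario(reply: str) -> Dict[str, str]:
    """Parse a model reply, raising ValueError if it breaks the constraints."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict) or set(data) != set(ALLOWED):
        raise ValueError("reply must be an object with exactly the expected keys")
    for field, values in ALLOWED.items():
        value = data[field]
        if not isinstance(value, str) or value not in values:
            raise ValueError(f"{field}={value!r} is not allowed")
    return data


def generate(
    records: List[str], llm: Callable[[str], str], budget: int = 20
) -> List[Dict[str, str]]:
    """Collect up to `budget` distinct, valid scenarios from the records."""
    scenarios: List[Dict[str, str]] = []
    for record in records:
        if len(scenarios) >= budget:
            break
        try:
            scenario = parse_scenario(llm(build_prompt(record)))
        except ValueError:
            continue  # skip replies that are malformed or out of scope
        if scenario not in scenarios:
            scenarios.append(scenario)
    return scenarios


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; it only reacts to one phrase."""
    if "work zone" in prompt:
        return (
            '{"road_type": "straight", "non_ego_maneuver": "hard_brake", '
            '"anomaly": "work_zone"}'
        )
    return "Sorry, I cannot help with that."


# Made-up records for illustration; these are not real NHTSA entries.
records = [
    "Lead vehicle braked suddenly at the entrance to a work zone on a highway.",
    "Vehicle struck a deer at night.",
]
scenarios = generate(records, fake_llm)
print(scenarios)
```

Validating every reply against the same vocabulary that appears in the prompt is one simple way to keep generated scenarios aligned with the test environment; the second record is dropped here because the stub returns free text instead of JSON.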

Who benefits

Automotive, Autonomous Vehicles, Robotics, Insurance, Transportation

Key takeaways

LLMs can generate diverse and accurate test scenarios for autonomous driving systems.
Real-world failure records are a valuable source for creating safety-critical test cases.
The method efficiently discovers system vulnerabilities within a limited testing budget.
It offers a modular approach compatible with various testing constraints.

Original post by Anjali Parashar, Chuchu Fan

"arXiv:2606.31131v1 Announce Type: new Abstract: To ensure safe on-road behavior, pre-deployment testing and failure discovery of Autonomous Driving Systems (ADS) is crucial. Present day simulation based testing methods focus largely on mathematical models for efficient search of…"


