
AI System Combines Vision and Language to Automatically Detect Traffic Hazards from City Cameras, Boosting Proactive Road Safety

New York City’s extensive network of traffic cameras generates vast amounts of video data daily, but manually reviewing it for safety insights is impractical for most transportation agencies. Researchers at NYU Tandon School of Engineering have developed an AI system called SeeUnsafe that uses multimodal large language models to automatically detect collisions and near-misses in traffic video, transforming how cities can improve road safety without costly new infrastructure. Published in Accident Analysis & Prevention, the study earned New York City’s Vision Zero Research Award, recognizing its alignment with the city’s goal of eliminating traffic fatalities. Professor Kaan Ozbay, senior author and director of NYU Tandon’s C2SMART center, presented the findings at the eighth annual Research on the Road symposium.

SeeUnsafe combines computer vision and natural language understanding to analyze long-form traffic videos. Developed through collaboration between NYU’s Center for Robotics and Embodied Intelligence and C2SMART, the system identifies dangerous events such as close calls between vehicles and pedestrians, unsafe turns, and other high-risk behaviors. It can pinpoint the location, time, and specific road users involved, offering actionable data for city planners.

One of the key advantages of SeeUnsafe is that it does not require agencies to train their own AI models. It leverages pre-trained models capable of processing both visual and textual information, making it accessible even to non-experts. This reduces barriers to entry and allows cities to make better use of existing camera systems.

In testing on the Toyota Woven Traffic Safety dataset, SeeUnsafe achieved 76.71% accuracy in classifying videos as collisions, near-misses, or normal traffic. It correctly identified the involved road users with up to 87.5% accuracy.
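The workflow described above can be sketched in a few lines: split a long video into clips, send each clip to a pre-trained multimodal model along with a fixed text prompt, and parse the free-text answer into one of three labels. This is a minimal illustrative sketch, not the authors' implementation; the clip names, the prompt wording, and the `fake_model` stub (standing in for a real multimodal-model API call) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List

LABELS = ("collision", "near-miss", "normal")

@dataclass
class ClipResult:
    start_s: float   # clip start time within the source video
    label: str       # one of LABELS
    explanation: str # model's natural-language rationale

def classify_video(clips: List[str],
                   clip_len_s: float,
                   ask_model: Callable[[str, str], str]) -> List[ClipResult]:
    """Ask a pre-trained multimodal model about each clip and parse
    its free-text answer into a collision / near-miss / normal label."""
    prompt = ("Does this traffic clip show a collision, a near-miss, "
              "or normal traffic? Explain which road users are involved.")
    results = []
    for i, clip in enumerate(clips):
        answer = ask_model(clip, prompt)  # in practice: an API call with video frames
        # take the first label mentioned in the answer; default to "normal"
        label = next((l for l in LABELS if l in answer.lower()), "normal")
        results.append(ClipResult(i * clip_len_s, label, answer))
    return results

# Hypothetical stub standing in for a real vision-language model.
def fake_model(clip: str, prompt: str) -> str:
    return {"clip_a": "Normal traffic flow, no conflicts observed.",
            "clip_b": "A near-miss between a turning car and a pedestrian."}[clip]

report = classify_video(["clip_a", "clip_b"], 10.0, fake_model)
for r in report:
    print(f"t={r.start_s:.0f}s  {r.label}: {r.explanation}")
```

Keeping the model's full answer alongside the parsed label mirrors the system's reported ability to produce natural-language explanations of each flagged event.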
Unlike traditional safety analysis that reacts after accidents occur, SeeUnsafe enables proactive intervention by detecting near-misses, the early warning signs of potential crashes. The system generates detailed natural language reports explaining its findings, including context such as weather, traffic volume, and the specific movements that contributed to incidents. This helps officials understand root causes and implement targeted improvements like adjusted traffic signals, better signage, or redesigned intersections.

While the system has limitations, such as sensitivity to object-tracking errors and performance challenges in low-light conditions, it represents a major step forward in using AI to interpret complex traffic environments. Researchers believe the approach could be adapted for in-vehicle dash cameras, enabling real-time risk assessment from a driver’s perspective.

The work builds on other C2SMART initiatives, including studies on electric truck impacts on infrastructure, speed camera effectiveness, a digital twin for emergency response optimization, and monitoring of overweight vehicles on the Brooklyn-Queens Expressway. These projects collectively aim to make New York’s transportation system safer, smarter, and more resilient.
