Papers
Daily updated cutting-edge AI research papers to help you keep up with the latest AI trends

Trust-Region Behavior Blending for On-Policy Distillation

SwanVoice: Expressive Long-Form Zero-Shot Speech Synthesis for Both Monologue and Dialogue
Representation Forcing for Bottleneck-Free Unified Multimodal Models
GrepSeek: Training Search Agents for Direct Corpus Interaction
COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation
Agentic Systems as Boosting Weak Reasoning Models
YoCausal: How Far is Video Generation from World Model? A Causality Perspective
minWM: A Full-Stack Open-Source Framework for Real-Time Interactive Video World Models
CollectionLoRA: Collecting 50 Effects in 1 LoRA via Multi-Teacher On-Policy Distillation
OmniRetrieval: Unified Retrieval across Heterogeneous Knowledge Sources
Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments
AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security
World Action Models: The Next Frontier in Embodied AI
World Action Models are Zero-shot Policies
ResearchMath-14K: Scaling Research-Level Mathematics via Agents
Self-Improving Language Models with Bidirectional Evolutionary Search
From Pixels to Words -- Towards Native One-Vision Models at Scale
Agent Explorative Policy Optimization for Multimodal Agentic Reasoning
ProRL: Effective Reinforcement Learning for Proactive Recommendation via Rectified Policy Gradient Estimation
Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players
AutoFigure: Generating and Refining Publication-Ready Scientific Illustrations
AutoResearch AI: Towards AI-Powered Research Automation for Scientific Discovery
Agent Harness Engineering: A Survey
D^2-Monitor: Dynamic Safety Monitoring for Diffusion LLMs via Hesitation-Aware Routing
Geometry-Aware Representation Denoising for Robust Multi-view 3D Reconstruction
EvalVerse: Pipeline-Aware and Expert-Calibrated Benchmarking for Professional Cinematic Video Generation
MobileGym: A Verifiable and Highly Parallel Simulation Platform for Mobile GUI Agent Research
SpatialBench: Is Your Spatial Foundation Model an All-Round Player?
LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding
Gemini Embedding 2: A Native Multimodal Embedding Model from Gemini
Language Models Need Sleep
ECHO: Terminal Agents Learn World Models for Free