Papers
Cutting-edge AI research papers, updated daily, to help you keep up with the latest AI trends

Black-Box On-Policy Distillation of Large Language Models

UniVA: Universal Video Agent towards Open-Source Next-Generation Video Generalist
PAN: A World Model for General, Interactable, and Long-Horizon World Simulation
One Small Step in Latent, One Giant Leap for Pixels: Fast Latent Upscale Adapter for Your Diffusion Models
YOLOv13: Real-Time Object Detection with Hypergraph-Enhanced Adaptive Visual Perception
MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm
Consensus Sampling for Safer Generative AI
Argus: Resilience-Oriented Safety Assurance Framework for End-to-End ADSs
WMPO: World Model-based Policy Optimization for Vision-Language-Action Models
LoopTool: Closing the Data-Training Loop for Robust LLM Tool Calls
Beyond Fact Retrieval: Episodic Memory for RAG with Generative Semantic Workspaces
TiDAR: Think in Diffusion, Talk in Autoregression
Time-to-Move: Training-Free Motion Controlled Video Generation via Dual-Clock Denoising
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds
Generating an Image From 1,000 Words: Enhancing Text-to-Image With Structured Captions
KLASS: KL-Guided Fast Inference in Masked Diffusion Models
Grounding Computer Use Agents on Human Demonstrations
Wasm: A Pipeline for Constructing Structured Arabic Interleaved Multimodal Corpora
Adaptive Multi-Agent Response Refinement in Conversational Systems
SPAN: Spatial-Projection Alignment for Monocular 3D Object Detection
Efficient Approximation of Volterra Series for High-Dimensional Systems
SofT-GRPO: Surpassing Discrete-Token LLM Reinforcement Learning via Gumbel-Reparameterized Soft-Thinking Policy Optimization
RedOne 2.0: Rethinking Domain-specific LLM Post-Training in Social Networking Services
The Station: An Open-World Environment for AI-Driven Discovery
DRIVE: Data Curation Best Practices for Reinforcement Learning with Verifiable Reward in Competitive Code Generation
IterResearch: Rethinking Long-Horizon Agents via Markovian State Reconstruction
HaluMem: Evaluating Hallucinations in Memory Systems of Agents
GVPO: Group Variance Policy Optimization for Large Language Model Post-Training
ReCA: Integrated Acceleration for Real-Time and Efficient Cooperative Embodied Autonomous Agents
DexFlyWheel: A Scalable and Self-improving Data Generation Framework for Dexterous Manipulation
NovaFlow: Zero-Shot Manipulation via Actionable Flow from Generated Videos
TreeSynth: Synthesizing Diverse Data from Scratch via Tree-Guided Subspace Partitioning