Papers
Daily updated cutting-edge AI research papers to help you keep up with the latest AI trends

V-RGBX: Video Editing with Accurate Controls over Intrinsic Properties

SVG-T2I: Scaling Up Text-to-Image Latent Diffusion Model Without Variational Autoencoder

DentalGPT: Incentivizing Multimodal Complex Reasoning in Dentistry
SSRB: Direct Natural Language Querying to Massive Heterogeneous Semi-Structured Data
MUVR: A Multi-Modal Untrimmed Video Retrieval Benchmark with Multi-Level Visual Correspondence
Evaluating Gemini Robotics Policies in a Veo World Simulator
MotionEdit: Benchmarking and Learning Motion-Centric Image Editing
Achieving Olympia-Level Geometry Large Language Model Agent via Complexity Boosting Reinforcement Learning
OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification
Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation
Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving
T-pro 2.0: An Efficient Russian Hybrid-Reasoning Model and Playground
AutoGLM: Autonomous Foundation Agents for GUIs
OpenGU: A Comprehensive Benchmark for Graph Unlearning
On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models
DeepCode: Open Agentic Coding
InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models
OmniPSD: Layered PSD Generation with Diffusion Transformer
HiF-VLA: Hindsight, Insight and Foresight through Motion Representation for Vision-Language-Action Models
Arbitrage: Efficient Reasoning via Advantage-Aware Speculation
Composing Concepts from Images and Videos via Concept-prompt Binding
StereoWorld: Geometry-Aware Monocular-to-Stereo Video Generation
Urania: Differentially Private Insights into AI Use
Training LLMs for Honesty via Confessions
Measuring Agents in Production
PolyMath: Evaluating Mathematical Reasoning in Multilingual Contexts
ThreadWeaver: Adaptive Threading for Efficient Parallel Reasoning in Language Models
SPARK: Stepwise Process-Aware Rewards for Reference-Free Reinforcement Learning
OneStory: Coherent Multi-Shot Video Generation with Adaptive Memory
Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality
Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance