Papers
Cutting-edge AI research papers, updated daily, to help you keep up with the latest AI trends

ParaVT: Taming the Tool Prior Paradox for Parallel Tool Use in Agentic Video Reinforcement Learning

TriSplat: Simulation-Ready Feed-Forward 3D Scene Reconstruction

Foundation Protocol: A Coordination Layer for Agentic Society
WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation
Macaron-A2UI: A Model for Generative UI in Personal Agents
DVAO: Dynamic Variance-adaptive Advantage Optimization for Multi-reward Reinforcement Learning
ViMU: Benchmarking Video Metaphorical Understanding
SMOL: Professionally translated parallel data for 115 under-represented languages
Chi-Bench: Can AI Agents Automate End-to-End, Long-Horizon, Policy-Rich Healthcare Workflows?
Combining On-Policy Optimization and Distillation for Long-Context Reasoning in Large Language Models
Through the Lens of Contrast: Self-Improving Visual Reasoning in VLMs
HRM-Text: Efficient Pretraining Beyond Scaling
See What I Mean: Aligning Vision and Language Representations for Video Fine-grained Object Understanding
StepAudio 2.5 Technical Report
SciAtlas: A Large-Scale Knowledge Graph for Automated Scientific Research
Rethinking Cross-Layer Information Routing in Diffusion Transformers
Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models
SkillOpt: Executive Strategy for Self-Evolving Agent Skills
CVEvolve: Autonomous Algorithm Discovery for Unstructured Scientific Data Processing
Poly-EPO: Training Exploratory Reasoning Models
MEMO: Memory as a Model
ACC: Compiling Agent Trajectories for Long-Context Training
Full Attention Strikes Back: Transferring Full Attention into Sparse within Hundred Training Steps
π-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows
Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality?
TransitLM: A Large-Scale Dataset and Benchmark for Map-Free Transit Route Generation
DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards
Interactive Evaluation Requires a Design Science
ESI-BENCH: Towards Embodied Spatial Intelligence that Closes the Perception-Action Loop
Comparative Analysis of Military Detection Using Drone Imagery Across Multiple Visual Spectrums
Automated ICD Classification of Psychiatric Diagnoses: From Classical NLP to Large Language Models
Coordinated Optimal Power Quality Management in Distribution Systems Using The Residual Capacity of Community IBRs