Papers
Cutting-edge AI research papers, updated daily, to help you keep up with the latest AI trends

Toward Efficient Agents: Memory, Tool learning, and Planning

FutureOmni: Evaluating Future Forecasting from Omni-Modal Context for Multimodal LLMs
Being-H0.5: Scaling Human-Centric Robot Learning for Cross-Embodiment Generalization
Advances and Frontiers of LLM-based Issue Resolution in Software Engineering: A Comprehensive Survey
Nemotron-Math: Efficient Long-Context Distillation of Mathematical Reasoning from Multi-Mode Supervision
Building Production-Ready Probes For Gemini
LFM2 Technical Report
CoDance: An Unbind-Rebind Paradigm for Robust Multi-Subject Animation
The Assistant Axis: Situating and Stabilizing the Default Persona of Language Models
ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development
Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge
Reasoning Models Generate Societies of Thought
A Large-Scale Study on the Development and Issues of Multi-Agent AI Systems
ACoT-VLA: Action Chain-of-Thought for Vision-Language-Action Models
When Personalization Misleads: Understanding and Mitigating Hallucinations in Personalized LLMs
RubricHub: A Comprehensive and Highly Discriminative Rubric Dataset via Automated Coarse-to-Fine Generation
Unlocking Implicit Experience: Synthesizing Tool-Use Trajectories from Text
The Poisoned Apple Effect: Strategic Manipulation of Mediated Markets via Technology Expansion of AI Agents
Your Group-Relative Advantage Is Biased
STEM: Scaling Transformers with Embedding Modules
Lost in the Noise: How Reasoning Models Fail with Contextual Distractors
Beyond Static Tools: Test-Time Tool Evolution for Scientific Reasoning
VIBE: Visual Instruction Based Editor
Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning
Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs
Urban Socio-Semantic Segmentation with Vision-Language Reasoning
STEP3-VL-10B Technical Report
SeedFold: Scaling Biomolecular Structure Prediction
TranslateGemma Technical Report
Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning
SkinFlow: Efficient Information Transmission for Open Dermatological Diagnosis via Dynamic Visual Encoding and Staged RL
A^3-Bench: Benchmarking Memory-Driven Scientific Reasoning via Anchor and Attractor Activation