Papers
Cutting-edge AI research papers, updated daily, to help you keep up with the latest AI trends

Language Models Can Learn from Verbal Feedback Without Scalar Rewards

Variational Reasoning for Language Models
EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning
MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing
Quantile Advantage Estimation for Entropy-Safe Reasoning
LongLive: Real-time Interactive Long Video Generation
Combinatorial Creativity: A New Frontier in Generalization Abilities
Causal Spatio-Temporal Prediction: An Effective and Efficient Multi-Modal Approach
Hunyuan3D-Omni: A Unified Framework for Controllable Generation of 3D Assets
Seedream 4.0: Toward Next-generation Multimodal Image Generation
Tree Search for LLM Agent Reinforcement Learning
SciReasoner: Laying the Scientific Reasoning Ground Across Disciplines
MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources
VCRL: Variance-based Curriculum Reinforcement Learning for Large Language Models
MultiEdit: Advancing Instruction-based Image Editing on Diverse and Challenging Tasks
BRISC: Annotated Dataset for Brain Tumor Segmentation and Classification with Swin-HAFNet
EmoBench-M: Benchmarking Emotional Intelligence for Multimodal Large Language Models
FDABench: A Benchmark for Data Agents on Analytical Queries over Heterogeneous Data
Easier Painting Than Thinking: Can Text-to-Image Models Set the Stage, but Not Direct the Play?
UniVerse-1: Unified Audio-Video Generation via Stitching of Experts
How Good are Foundation Models in Step-by-Step Embodied Reasoning?
SpikingBrain Technical Report: Spiking Brain-inspired Large Models
SAGE: A Realistic Benchmark for Semantic Understanding
WAVECLIP: Wavelet Tokenization for Adaptive-Resolution CLIP
EmbeddingGemma: Powerful and Lightweight Text Representations
Advancing Speech Understanding in Speech-Aware Language Models with GRPO
How Far are VLMs from Visual Spatial Intelligence? A Benchmark-Driven Perspective
SIM-CoT: Supervised Implicit Chain-of-Thought
SWE-QA: Can Language Models Answer Repository-level Code Questions?
Video models are zero-shot learners and reasoners
An N-Plus-1 GPT Agency for Critical Solution of Mechanical Engineering Analysis Problems
Memory-QA: Answering Recall Questions Based on Multimodal Memories