Papers
Cutting-edge AI research papers, updated daily, to help you keep up with the latest AI trends

ASTRA: Automated Synthesis of agentic Trajectories and Reinforcement Arenas

Self-Distillation Enables Continual Learning
Towards Execution-Grounded Automated AI Research
DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation
MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods
OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models
Scaling Embeddings Outperforms Scaling Experts in Language Models
Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives
Everything in Its Place: Benchmarking Spatial Intelligence of Text-to-Image Models
Qwen3-ASR Technical Report
Insight Agents: An LLM-Based Multi-Agent System for Data Insights
Towards Pixel-Level VLM Perception via Simple Points Prediction
Youtu-VL: Unleashing Visual Potential via Unified Vision-Language Supervision
Innovator-VL: A Multimodal Large Language Model for Scientific Discovery
Advancing Open-source World Models
Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation
Short window attention enables long-term memorization
World Craft: Agentic Framework to Create Visualizable Worlds via Text
Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models
Masked Depth Modeling for Spatial Perception
A Pragmatic VLA Foundation Model
AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning
AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security
Arcee Trinity Large Technical Report
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability
ATLAS: Adaptive Transfer Scaling Laws for Multilingual Pretraining, Finetuning, and Decoding the Curse of Multilinguality
iFSQ: Improving FSQ for Image Generation with 1 Line of Code
Elastic Attention: Test-time Adaptive Sparsity Ratios for Efficient Transformers
Scientific Image Synthesis: Benchmarking, Methodologies, and Downstream Utility
The Script is All You Need: An Agentic Framework for Long-Horizon Dialogue-to-Cinematic Video Generation
daVinci-Dev: Agent-native Mid-training for Software Engineering
Can LLMs Clean Up Your Mess? A Survey of Application-Ready Data Preparation with LLMs