Papers
Cutting-edge AI research papers, updated daily, to help you keep up with the latest AI trends

VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning

A Survey of Context Engineering for Large Language Models

Assessing adaptive world models in machines with novel games
Emotional Support with LLM-based Empathetic Dialogue Generation
DrafterBench: Benchmarking Large Language Models for Tasks Automation in Civil Engineering
SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?
MOSPA: Human Motion Generation Driven by Spatial Audio
MMHU: A Massive-Scale Multimodal Benchmark for Human Behavior Understanding
PhysX: Physical-Grounded 3D Asset Generation
Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs
La-Proteina: Atomistic Protein Generation via Partially Latent Flow Matching
SUICA: Learning Super-high Dimensional Sparse Implicit Neural Representations for Spatial Transcriptomics
XiChen: An observation-scalable fully AI-driven global weather forecasting system with 4D variational knowledge
AgentsNet: Coordination and Collaborative Reasoning in Multi-Agent LLMs
Can Multimodal Foundation Models Understand Schematic Diagrams? An Empirical Study on Information-Seeking QA over Scientific Papers
Scaling Laws for Optimal Data Mixtures
Subject-Consistent and Pose-Diverse Text-to-Image Generation
Vision-Language-Vision Auto-Encoder: Scalable Knowledge Distillation from Diffusion Models
DuetGraph: Coarse-to-Fine Knowledge Graph Reasoning with Dual-Pathway Global-Local Fusion
CogDDN: A Cognitive Demand-Driven Navigation with Decision Optimization and Dual-Process Thinking
LayerCake: Token-Aware Contrastive Decoding within Large Language Model Layers
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation
REST: Stress Testing Large Reasoning Models by Asking Multiple Problems at Once
EmbRACE-3K: Embodied Reasoning and Action in Complex Environments
Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination
SpeakerVid-5M: A Large-Scale High-Quality Dataset for Audio-Visual Dyadic Interactive Human Generation
VerifyBench: A Systematic Benchmark for Evaluating Reasoning Verifiers Across Domains
Sidechain conditioning and modeling for full-atom protein sequence design with FAMPNN
One Token to Fool LLM-as-a-Judge
From One to More: Contextual Part Latents for 3D Generation
Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning
Lumos-1: On Autoregressive Video Generation from a Unified Model Perspective