Papers
Cutting-edge AI research papers, updated daily, to help you keep up with the latest AI trends

A Survey of Vibe Coding with Large Language Models

Detect Anything via Next Point Prediction
Scaling Language-Centric Omnimodal Representation Learning
DITING: A Multi-Agent Evaluation Framework for Benchmarking Web Novel Translation
Advancing End-to-End Pixel Space Generative Modeling via Self-supervised Pre-training
Spatial Forcing: Implicit Spatial Representation Alignment for Vision-language-action Model
Asking Clarifying Questions for Preference Elicitation With Large Language Models
CTRL-Rec: Controlling Recommender Systems With Natural Language
RLFR: Extending Reinforcement Learning for LLMs with Flow Environment
Latent Refinement Decoding: Enhancing Diffusion-Based Language Models by Refining Belief States
OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs
BEAR: Benchmarking and Enhancing Multimodal Language Models for Atomic Embodied Capabilities
Diffusion Transformers with Representation Autoencoders
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs
Inverse-Free Wilson Loops for Transformers: A Practical Diagnostic for Invariance and Order Sensitivity
TUMIX: Multi-Agent Test-Time Scaling with Tool-Use Mixture
R-Horizon: How Far Can Your Large Reasoning Model Really Go in Breadth and Depth?
AutoPR: Let's Automate Your Academic Promotion!
Multimodal Prompt Optimization: Why Not Leverage Multiple Modalities for MLLMs
TAG: Tangential Amplifying Guidance for Hallucination-Resistant Diffusion Sampling
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation
D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI
Code2Video: A Code-centric Paradigm for Educational Video Generation
Dr. Bias: Social Disparities in AI-Powered Medical Guidance
The Potential of Second-Order Optimization for LLMs: A Study with Full Gauss-Newton
Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning
From What to Why: A Multi-Agent System for Evidence-based Chemical Reaction Condition Reasoning
DreamOmni2: Multimodal Instruction-based Editing and Generation
VideoCanvas: Unified Video Completion from Arbitrary Spatiotemporal Patches via In-Context Conditioning
UniVideo: Unified Understanding, Generation, and Editing for Videos
MemMamba: Rethinking Memory Patterns in State Space Model
MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization