Papers
Cutting-edge AI research papers, updated daily, to help you keep up with the latest AI trends

PlayerOne: Egocentric World Simulator

ComfyUI-R1: Exploring Reasoning Models for Workflow Generation

Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation
Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models
On the Use of Self-Supervised Speech Representations in Spontaneous Speech Synthesis
Efficient Machine Learning Force Field for Large-Scale Molecular Simulations of Organic Systems
vLLM Hook v0: A Plug-in for Programming Model Internals on vLLM
MIRAGE: Retrieval and Generation of Multimodal Images and Texts for Medical Education
ComfyGPT: A Self-Optimizing Multi-Agent System for Comprehensive ComfyUI Workflow Generation
Sequence Model Design for Code Completion in the Modern IDE
ACE-Step: A Step Towards Music Generation Foundation Model
Zero-shot text-to-speech synthesis conditioned using self-supervised speech representation model
Can Small and Reasoning Large Language Models Score Journal Articles for Research Quality and Do Averaging and Few-shot Help?
A Flexible and Secure Deployment Framework for Distributed Applications
Multimodal Pretraining and Generation for Recommendation: A Tutorial
A Theoretical Limit to Physicalism: A Non-Technical Explanation of the Gemini Theorem
EmoSSLSphere: Multilingual Emotional Speech Synthesis with Spherical Vectors and Discrete Speech Tokens
Propagation dynamics of the circular Airy Gaussian vortex beams in the fractional nonlinear Schrödinger equation
VASP on a GPU: application to exact-exchange calculations of the stability of elemental boron
Information quantity in a pixel of digital image
AR-RAG: Autoregressive Retrieval Augmentation for Image Generation
Recognition of Handwritten Roman Script Using Tesseract Open Source OCR Engine
TimeSenCLIP: A Time Series Vision-Language Model for Remote Sensing
Learning Temporal Evolution of Spatial Dependence with Generalized Spatiotemporal Gaussian Process Models
TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models
Qwen2.5-Omni Technical Report
Dual-Scale Single Image Dehazing Via Neural Augmentation
SpeechPrompt: Prompting Speech Language Models for Speech Processing Tasks
Music SketchNet: Controllable Music Generation via Factorized Representations of Pitch and Rhythm
Adaptive Data Flywheel: Applying MAPE Control Loops to AI Agent Improvement
Expand VSR Benchmark for VLLM to Expertize in Spatial Rules
DensityTool: A post-processing tool for space and spin-resolved density of states from VASP