
Papers

Daily updated cutting-edge AI research papers to help you keep up with the latest AI trends


© HyperAI


VITA-E: Natural Embodied Interaction with Concurrent Seeing, Hearing, Speaking, and Acting

Embodied Intelligence

Xiaoyu Liu, Chaoyou Fu, Chi Yan, et al.

FARMER: Flow AutoRegressive Transformer over Pixels

Image Generation

Guangting Zheng, Qinyu Zhao, Tao Yang, et al.

A Survey of Data Agents: Emerging Paradigm or Overstated Hype?

Yizhang Zhu, Liangwei Wang, Chenyu Yang, et al.

ReCode: Unify Plan and Action for Universal Granularity Control

Code Generation

Zhaoyang Yu, Jiayi Zhang, Huixue Su, et al.

Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations

Multimodal Representation

Computer Vision

Yujia Zhang, Xiaoyang Wu, Yixing Lao, et al.

Magellan: Guided MCTS for Latent Space Exploration and Novelty Generation

Text Generation

DEEDEE: Fast and Scalable Out-of-Distribution Dynamics Detection

Reinforcement Learning

Tala Aljaafari, Varun Kanade, Philip Torr, et al.

Sparser Block-Sparse Attention via Token Permutation

Xinghao Wang, Pengyu Wang, Dong Zhang, et al.

A Definition of AGI

Dan Hendrycks, Dawn Song, Christian Szegedy, et al.

From Denoising to Refining: A Corrective Framework for Vision-Language Diffusion Model

Diffusion Model

Yatai Ji, Teng Wang, Yuying Ge, et al.

Sample By Step, Optimize By Chunk: Chunk-Level GRPO For Text-to-Image Generation

Image Generation

Yifu Luo, Penghui Du, Bo Li, et al.

Video-As-Prompt: Unified Semantic Control for Video Generation

Video Generation

Yuxuan Bian, Xin Chen, Zenan Li, et al.

DeepAgent: A General Reasoning Agent with Scalable Toolsets

Xiaoxi Li, Wenxiang Jiao, Jiarui Jin, et al.

Uncertainty-Aware Multi-Objective Reinforcement Learning-Guided Diffusion Models for 3D De Novo Molecular Design

Diffusion Model

Reinforcement Learning

Lianghong Chen, Dongkyu Eugene Kim, Mike Domaratzki, et al.

Reac-Discovery: an artificial intelligence–driven platform for continuous-flow catalytic reactor discovery and optimization

Cristopher Tinajero, Marcileia Zanatta, Julián E. Sánchez-Velandia, et al.

BoltzGen: Toward Universal Binder Design

Hannes Stark, Felix Faltings, MinGyu Choi, et al.

HSCodeComp: A Realistic and Expert-level Benchmark for Deep Search Agents in Hierarchical Rule Application

Yiqian Yang, Tian Lan, Qianghuai Jia, et al.

DyPE: Dynamic Position Extrapolation for Ultra High Resolution Diffusion

Diffusion Model

Noam Issachar, Guy Yariv, Sagie Benaim, et al.

HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives

Video Generation

Yihao Meng, Hao Ouyang, Yue Yu, et al.

Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence

Video Understanding

Jiahao Meng, Xiangtai Li, Haochen Wang, et al.

AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders

Yuezhou Hu, Jiaxin Guo, Xinyu Feng, et al.

Human-Agent Collaborative Paper-to-Page Crafting for Under $0.1

Qianli Ma, Siyu Wang, Yilin Chen, et al.

See the Text: From Tokenization to Visual Reading

Ling Xing, Alex Jinpeng Wang, Rui Yan, et al.

Directional Reasoning Injection for Fine-Tuning MLLMs

Visual Question Answering

Chao Huang, Zeliang Zhang, Jiang Liu, et al.

Language Models are Injective and Hence Invertible

Natural Language Processing

Giorgos Nikolaou, Tommaso Mencattini, Donato Crisostomi, et al.

The Free Transformer

François Fleuret

Quantum Processing Unit (QPU) processing time Prediction with Machine Learning

Machine Learning

Lucy Xing, Sanjay Vishwakarma, David Kremer, et al.

Observation of constructive interference at the edge of quantum ergodicity

Google Quantum AI and Collaborators

VideoAgentTrek: Computer Use Pretraining from Unlabeled Videos

Action Recognition

Human-Computer Interaction

Dunjie Lu, Yiheng Xu, Junli Wang, et al.

GigaBrain-0: A World Model-Powered Vision-Language-Action Model

Embodied Intelligence

GigaBrain Team, Angen Ye, Boyuan Wang, et al.

LoongRL: Reinforcement Learning for Advanced Reasoning over Long Contexts

Siyuan Wang, Gaokai Zhang, Li Lyna Zhang, et al.

BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping

Reinforcement Learning

Zhiheng Xi, Xin Guo, Yang Nan, et al.




