HyperAI

Main

GPU

Console
Studio
Docs
Pricing

Pulse

News

Resources

Papers
Notebooks
Datasets
Wiki

Benchmarks

SOTA
LLM Models
GPU Leaderboard

Community

Events

Utility

About Terms of Service Privacy Policy
English

Papers

Cutting-edge AI research papers, updated daily, to help you keep up with the latest AI trends

Build the Future of Artificial Intelligence

About

About Us Support Dataset Help

Products

News Papers Notebooks Datasets Wiki

Links

© HyperAI

GitHub Discord X (formerly Twitter)

DSpark: Confidence-Scheduled Speculative Decoding with Semi-Autoregressive Generation

Text Generation

Xin Cheng, Xingkai Yu, Chenze Shao, et al.

ViQ: Text-Aligned Visual Quantized Representations at Any Resolution

Multimodal Representation

Xumin Yu, Zuyan Liu, Zhenyu Yang, et al.

The Verification Horizon: No Silver Bullet for Coding Agent Rewards

Code Generation

Binghai Wang, Chenlong Zhang, Dayiheng Liu, et al.

Qwen-Image-Agent: Bridging the Context Gap in Real-World Image Generation

Image Generation

Zekai Zhang, Jiahao Li, Jie Zhang, et al.

OPID: On-Policy Skill Distillation for Agentic Reinforcement Learning

Reinforcement Learning

Shuo Yang, Jinyang Wu, Zhengxi Lu, et al.

In-Context World Modeling for Robotic Control

Siyin Wang, Junhao Shi, Senyu Fei, et al.

DanceOPD: On-Policy Generative Field Distillation

Image Generation

Wei Zhou, Xiongwei Zhu, Zelin Xu, et al.

Autodata: An agentic data scientist to create high quality synthetic data

Supervised Fine-Tuning

Ilia Kulikov, Chenxi Whitehouse, Tianhao Wu, et al.

Improved Large Language Diffusion Models

Diffusion Model

Text Generation

Shen Nie, Qiyang Min, Shaoxuan Xu, et al.

How Robust is OCR-Reasoning? Evaluating OCR-Reasoning Robustness of Vision-Language Models under Visual Perturbations

Document Understanding

Yuxing Cheng, Yuan Wu, Yi Chang

RoboAtlas: Contextual Active SLAM

3D Machine Vision

Alexander Schperberg, Shivam K. Panda, Abraham P. Vinod, et al.

Learning Robot Visual Navigation in Crowds via Intention-Aware Scene Representations

Action Recognition

Han Bao, Bingyi Xia, Hanjing Ye, et al.

Deep Reinforcement Learning-Enhanced Event-Triggered Data-Driven Predictive Control for a 3D Cable-Driven Soft Robotic Arm

Reinforcement Learning

Cheng Ouyang, Moeen Ul Islam, Kaixiang Zhang, et al.

Natural Ungrokking: Asymmetric Control of Which Rules Survive Pretraining

Juliana Li, Diya Sreedhar

Every Nonnegative Integer Is a Sum of a Triangular, a Pentagonal, and a Heptagonal Number

Yichuan Cao, Dakai Guo, Ruichen Qiu, et al.

Loop Engineering: The Anthropic Playbook for Designing Systems That Prompt Your Agents

Peter Steinberger, Boris Cherny, Addy Osmani

Small LLMs: Pruning vs. Training from Scratch

Yufeng Xu, Taiming Lu, Kunjun Li, et al.

OpenThoughts-Agent: Data Recipes for Agentic Models

Negin Raoof, Richard Zhuang, Marianna Nezhurina, et al.

LingxiDiagBench: A Multi-Agent Framework for Benchmarking LLMs in Chinese Psychiatric Consultation and Diagnosis

Shihao Xu, Tiancheng Zhou, Jiatong Ma, et al.

AOHP: An Open-Source OS-Level Agent Harness for Personalized, Efficient and Secure Interaction

Shanhui Zhao, Jiacheng Liu, Guohong Liu, et al.

MemGUI-Agent: An End-to-End Long-Horizon Mobile GUI Agent with Proactive Context Management

Guangyi Liu, Gao Wu, Congxiao Liu, et al.

MobileForge: Annotation-Free Adaptation for Mobile GUI Agents with Hierarchical Feedback-Guided Policy Optimization

Guangyi Liu, Pengxiang Zhao, Gao Wu, et al.

NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers?

Yuru Wang, Lejun Cheng, Yuxin Zuo, et al.

Qwen-AgentWorld: Language World Models for General Agents

Yuxin Zuo, Zikai Xiao, Li Sheng, et al.

Rethinking Training Targets, Architectures and Data Quality for Universal Speech Enhancement

Audio and Speech Processing

Szu-Wei Fu, Rong Chao, Xuesong Yang, et al.

Generative 3D Gaussians with Learned Density Control

Diffusion Model

Runjie Yan, Yan-Pei Cao, Peng Wang, et al.

TADA: A Generative Framework for Speech Modeling via Text-Acoustic Dual Alignment

Trung Dang, Sharath Rao, Ananya Gupta, et al.

Beyond Isolated Words: Diffusion Brush for Handwritten Text-Line Generation

Diffusion Model

Image Generation

Gang Dai, Yifan Zhang, Yutao Qin, et al.

gsplat: An Open-Source Library for Gaussian Splatting

Vickie Ye, Ruilong Li, Justin Kerr, et al.

OmniVideo-100K: A Dataset for Audio-Visual Reasoning through Structured Scripts and Evidence Chains

Video Understanding

Visual Question Answering

Xinyue Cai, Chaoyou Fu, Yi-Fan Zhang, et al.

OPEN-SWE-TRACES: Advancing Dual-Mode Multilingual Distillation for Software Engineering Agents

Code Generation

Text Generation

Wasi Uddin Ahmad, Nikolai Ludwig, Somshubra Majumdar, et al.

Credit Assignment with Resets in Language Model Reasoning

Reinforcement Learning

Ankur Samanta, Akshayaa Magesh, Ayush Jain, et al.
