HyperAI

Latest Research Papers

Groundbreaking AI research papers, updated daily, to help you keep up with the latest AI trends

Astra: Toward General-Purpose Mobile Robots via Hierarchical Multimodal Learning
Sheng Chen, Peiyu He, Jiaxin Hu, et al.
Publication date: 6/10/2025
SpatialLM: Training Large Language Models for Structured Indoor Modeling
Yongsen Mao, Junhao Zhong, Chuan Fang, et al.
Publication date: 6/10/2025
OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation
Jingjing Chang, Yixiao Fang, Peng Xing, et al.
Publication date: 6/10/2025
Proactive Assistant Dialogue Generation from Streaming Egocentric Videos
Yichi Zhang, Xin Luna Dong, Zhaojiang Lin, et al.
Publication date: 6/10/2025
PersonaAgent: When Large Language Model Agents Meet Personalization at Test Time
Weizhi Zhang, Xinyang Zhang, Chenwei Zhang, et al.
Publication date: 6/10/2025
Audio-Aware Large Language Models as Judges for Speaking Styles
Cheng-Han Chiang, Xiaofei Wang, Chung-Ching Lin, et al.
Publication date: 6/9/2025
MORSE-500: A Programmatically Controllable Video Benchmark to Stress-Test Multimodal Reasoning
Zikui Cai, Andrew Wang, Anirudh Satheesh, et al.
Publication date: 6/9/2025
Leveraging Self-Attention for Input-Dependent Soft Prompting in LLMs
Ananth Muppidi, Abhilash Nandy, Sambaran Bandyopadhyay
Publication date: 6/9/2025
Is Extending Modality The Right Path Towards Omni-Modality?
Tinghui Zhu, Kai Zhang, Muhao Chen, et al.
Publication date: 6/9/2025
FusionAudio-1.2M: Towards Fine-grained Audio Captioning with Multimodal Contextual Fusion
Shunian Chen, Xinyuan Xie, Zheshu Chen, et al.
Publication date: 6/9/2025