RUC Gaoling School of Artificial Intelligence Has 39 Papers Accepted at NeurIPS 2025

The acceptance results for NeurIPS 2025, one of the top-tier international academic conferences in machine learning alongside ICML and ICLR, have been released. The Gaoling School of Artificial Intelligence at Renmin University of China has had 39 papers accepted: 37 in the Main Track (including 2 Oral presentations) and 2 in the Datasets and Benchmarks Track. The 39th NeurIPS conference is scheduled to take place from December 2 to 7, 2025, in both San Diego, USA, and Mexico City, Mexico. Among the accepted works, several stand out for their innovative contributions across diverse AI domains:

MokA: Multimodal Low-Rank Adaptation for MLLMs (Oral)
Authors: Wei Yake, Miao Yu, Zhou Dongzhan, Hu Di
This paper proposes MokA, a novel low-rank adaptation framework for multimodal large language models (MLLMs). It addresses the limitation of directly transferring unimodal fine-tuning strategies to multimodal settings by explicitly modeling both intra-modal characteristics and inter-modal interactions. MokA achieves superior performance across multiple multimodal benchmarks while maintaining efficiency.

Large Language Diffusion Models (Oral)
Authors: Nie Shen, Zhu Fengqi, You Zebin, Zhang Xiaolu, Ou Jingyang, Hu Jun, Zhou Jun, Lin Yankai, Wen Jirong, Li Chongxuan
This work introduces LLaDA, a diffusion language model trained from scratch under a pretraining and supervised fine-tuning paradigm. Unlike traditional autoregressive models, LLaDA uses a forward masking process and a Transformer-based reverse generation mechanism. It demonstrates strong scalability and competitive performance against autoregressive baselines on general, mathematical, and code tasks. Notably, LLaDA-8B matches LLaMA3-8B in few-shot learning and shows strong instruction-following ability after fine-tuning. It also outperforms GPT-4o on a reversal poem-completion task, challenging the assumption that language modeling fundamentally requires autoregression.

Think or Not? Exploring Thinking Efficiency in Large Reasoning Models via an Information-Theoretic Lens
Authors: Yong Xixian, Zhou Xiao, Zhang Yingying, Li Jinlin, Zheng Yefeng, Wu Xian
This study analyzes reasoning efficiency in large reasoning models through an information-theoretic lens. It introduces two metrics, InfoBias and InfoGain, to measure deviation from the ideal reasoning path and the incremental information contributed by each step. The findings reveal that longer chains often exhibit higher bias and diminishing gains, especially for incorrect answers. The authors propose Adaptive Think, an entropy-based dynamic stopping strategy, which reduces token usage by 50.8% while improving average accuracy by 1.10% on QwQ-32B across six diverse reasoning tasks (a minimal sketch of this kind of entropy-based stopping rule appears after this group of summaries).

Beyond Last-Click: An Optimal Mechanism for Ad Attribution
Authors: An Nan, Li Weian, Qi Qi, Yu Changyuan, Zhang Liang
This paper critiques the widely used Last-Click Mechanism (LCM) for ad attribution, showing that it fails to satisfy incentive compatibility and performs poorly in heterogeneous environments. The authors propose the Peer-Validated Mechanism (PVM), which relies on cross-platform reports and prior probabilities, making it immune to self-reporting bias. Theoretical and empirical results show that PVM achieves optimal accuracy in homogeneous settings and strong fairness guarantees in heterogeneous ones.
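To make the Adaptive Think idea referenced above more concrete, here is a minimal sketch of an entropy-based stopping rule for chain-of-thought generation. The threshold, patience window, and function names are illustrative assumptions; the paper's actual criterion is not spelled out in this article.

```python
import math

def token_entropy(probs):
    """Shannon entropy (in nats) of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def should_stop_thinking(step_entropies, threshold=0.8, patience=3):
    """Illustrative stopping rule: halt the reasoning chain once per-step
    entropy has stayed below `threshold` for `patience` consecutive steps.
    Both knobs are hypothetical values, not taken from the paper."""
    if len(step_entropies) < patience:
        return False
    return all(h < threshold for h in step_entropies[-patience:])

# toy usage: entropies recorded after each reasoning step
print(should_stop_thinking([2.1, 1.4, 0.6, 0.5, 0.4]))  # True
```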
Universally Invariant Learning in Equivariant GNNs
Authors: Cen Jia-cheng, Li Anyi, Lin Ning, Xu Tingyang, Rong Yu, Zhao Deli, Wang Zihao, Huang Wenbing
The authors present a theoretically grounded framework for constructing complete equivariant graph neural networks (GNNs). They prove that two components, a geometric standard form and a full-rank controllable basis, are sufficient for completeness. The proposed method enables efficient construction on top of common models such as EGNN and TFN, achieving strong performance with minimal layers and reduced computational cost.

FlexWorld: Progressively Expanding 3D Scenes for Flexible-View Exploration
Authors: Chen Luxi, Zhou Zihan, Zhao Min, Wang Yikai, Zhang Ge, Huang Wenhao, Sun Hao, Wen Jirong, Li Chongxuan
FlexWorld generates flexible-view 3D scenes from a single image using a fine-tuned video-to-video diffusion model and a progressive generation process. It supports 360-degree rotation and smooth navigation, demonstrating strong visual quality and flexibility in generating immersive 3D content.

Beyond Single-Point Judgment: Distribution Alignment for LLM-as-a-Judge
Authors: Chen Luyu, Zhang Zeyu, Tan Haoran, Dai Quanyu, Yang Hao, Dong Zhenhua, Chen Xu
This paper challenges the common practice of single-score evaluation by proposing a distribution-alignment framework. It aligns LLM-generated judgment distributions with human rating distributions using KL divergence and cross-entropy regularization, enhanced with adversarial training for robustness (a minimal loss sketch appears after this group of summaries). The method significantly outperforms both closed-source models and single-point baselines.

Masked Diffusion Models as Energy Minimization
Authors: Chen Sitong, Nie Shen, Sun Jiacheng, Feng Zijin, Li Zhen Guo, Wen Jirong, Li Chongxuan
The authors establish a theoretical framework linking masked diffusion models (MDMs) to energy minimization in discrete optimal transport. They prove the equivalence of kinetic, conditional kinetic, and geodesic energies under optimal scheduling. A Beta-distribution-based parameterization enables efficient post-training tuning, improving sampling quality, especially at low step counts.

Improving Retrieval-Augmented Generation through Multi-Agent Reinforcement Learning
Authors: Chen Yiqun, Yan Lingyong, Sun Weiwei, Ma Xinyu, Zhang Yi, Wang Shuaqiang, Yin Dawei, Yiming Yang, Mao Jiaxin
The paper proposes MMOA-RAG, a multi-agent reinforcement learning framework for jointly optimizing the modules of a RAG pipeline. By modeling the modules as cooperative agents and using global answer quality as the shared reward, the method achieves significant performance gains over baselines on multiple QA datasets.

Incentivizing Dual Process Thinking for Efficient Large Language Model Reasoning
Authors: Cheng Xiaoxue, Li Junyi, Zhang Zhenduo, Tang Xinyu, Zhao Xin, Kong Xinyu, Zhang Zhiqiang
Inspired by dual-process theory in cognitive science, the authors propose ACPO, a reinforcement learning framework that enables adaptive cognitive switching. It uses explicit "system tokens" and online difficulty estimation to reduce overthinking on simple tasks while maintaining accuracy on complex ones.

UniGist: Towards General and Hardware-aligned Sequence-level Long Context Compression
Authors: Deng Chenlong, Zhang Zhisong, Mao Kelong, Li Shuaiyi, Fang Tianqing, Zhang Hongming, Mi Haitao, Yu Dong, Dou Zhicheng
UniGist introduces a gist-token-based framework for long-context compression. It eliminates the need for chunked training and uses a sparse attention kernel with a Gist Shift mechanism to improve efficiency. The model achieves superior detail recall and long-range dependency modeling across multiple tasks.
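The distribution-alignment idea from Beyond Single-Point Judgment (referenced above) can be illustrated with a small loss sketch. It assumes the judge produces logits over discrete rating levels and that an empirical human rating distribution is available; the weighting and the adversarial-training component are omitted, and the exact formulation is an assumption rather than the paper's specification.

```python
import torch
import torch.nn.functional as F

def distribution_alignment_loss(judge_logits, human_dist, ce_weight=0.5):
    """Illustrative loss: KL(human || judge) plus a cross-entropy term toward
    the human mode. `ce_weight` and the combination are assumptions.
    judge_logits: (batch, num_levels) raw scores from the LLM judge.
    human_dist:   (batch, num_levels) empirical human rating distribution."""
    log_q = F.log_softmax(judge_logits, dim=-1)
    kl = F.kl_div(log_q, human_dist, reduction="batchmean")       # KL(human || judge)
    ce = F.cross_entropy(judge_logits, human_dist.argmax(dim=-1)) # mode-matching regularizer
    return kl + ce_weight * ce

# toy usage with 5 rating levels
logits = torch.randn(2, 5)
human = torch.tensor([[0.05, 0.20, 0.40, 0.20, 0.15],
                      [0.05, 0.10, 0.10, 0.30, 0.45]])
print(distribution_alignment_loss(logits, human))
```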
Domain Specific Pruning of Large Mixture-of-Experts Models with Few-shot Demonstrations
Authors: Dong Zican, Peng Han, Liu Peiyu, Zhao Xin, Wu Dong, Xiao Feng, Wang Zhifeng
The authors observe that only a few domain-specific examples are needed to reveal the sparse set of experts a MoE model activates for that domain. They propose EASY-EP, a pruning framework that uses output-aware expert importance and token contribution estimation (a hedged sketch of this kind of expert scoring follows this group of summaries). On DeepSeek-R1 and DeepSeek-V3, it achieves comparable performance while keeping only half of the experts.

Generalizing Experience for Language Agents with Hierarchical MetaFlows
Authors: Fan Shengda, Cong Xin, Zhang Zhong, Fu Yuepeng, Wu Yese, Wang Hao, Zhang Xinyu, Hu Enrui, Lin Yankai
MetaFlowLLM builds a hierarchical experience tree from past tasks, enabling agents to retrieve and reuse relevant MetaFlows. It boosts performance on AppWorld and WorkBench by 32.3% and 6.2% respectively while reducing execution cost.

Learning to Focus: Causal Attention Distillation via Gradient-Guided Token Pruning
Authors: Guo Yiju, Yang Wenkai, Sun Zexu, Ding Ning, Liu Zhiyuan, Lin Yankai
The paper identifies attention drift caused by confounding tokens as a source of reasoning errors. LeaF uses gradient contrast to detect and remove such tokens, then distills the resulting causal attention patterns. It improves performance by 2–4% across math, code, and RAG tasks.

Geometric Mixture Models for Electrolyte Conductivity Prediction
Authors: Li Anyi, Cen Jia-cheng, Li Songyou, Li Mingze, Yu Yang, Huang Wenbing
The authors propose GeoMix, a geometry-aware framework for predicting electrolyte conductivity. It uses molecular geometric graphs and equivariant interaction networks, outperforming baselines on two benchmark datasets.

CAM: A Constructivist View of Agentic Memory for LLM-Based Reading Comprehension
Authors: Li Rui, Dai Quanyu, Zhang Zeyu, Bo Xiaohe, Tian Zihang, Chen Xu, Dong Zhenhua, Tang Ruiming
CAM applies Piaget's constructivism to memory design, introducing structured schemata, flexible assimilation, and dynamic accommodation. It uses incremental overlapping clustering to build a memory system that supports efficient, context-aware retrieval.

WebThinker: Empowering Large Reasoning Models with Deep Research Capability
Authors: Li Xiaoxi, Jin Jiajie, Dong Guanting, Qian Hongjin, Zhu Yutao, Wu Yongkang, Wen Jirong, Dou Zhicheng
WebThinker enables large reasoning models to conduct end-to-end research by combining autonomous thinking, web search, and report writing. It uses reinforcement learning to optimize tool usage and outperforms existing systems on complex reasoning and report-generation tasks.

Demystifying Reasoning Dynamics with Mutual Information: Thinking Tokens are Information Peaks in LLM Reasoning
Authors: Qian Chen, Liu Dongrui, Wen Haochen, Bai Zhen, Liu Yong, Shao Jing
This study finds that the mutual information between intermediate representations and the final answer spikes at specific "thinking tokens" such as "Hmm" or "Therefore." These tokens are critical for reasoning, and the paper proposes methods that leverage them to improve performance.

Robotic Policy Learning via Human-assisted Action Preference Optimization
Authors: Xia Wenke, Yang Yichu, Wu Hongtao, Ma Xiao, Kong Tao, Hu Di
The paper introduces a human-in-the-loop framework in which vision-language-action (VLA) models learn from minimal human corrections during deployment, enabling fast adaptation to fine-grained robotic tasks.
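As a rough illustration of the few-shot expert-pruning idea in Domain Specific Pruning of Large Mixture-of-Experts Models (referenced above), the sketch below scores experts by routing probability weighted by the magnitude of their output over a handful of domain examples, then keeps the top fraction. The scoring formula and names are assumptions for illustration, not the EASY-EP algorithm itself.

```python
import torch

def score_experts(router_probs, expert_output_norms):
    """Hypothetical expert-importance score: average over tokens of routing
    probability weighted by the norm of each expert's output contribution.
    router_probs, expert_output_norms: (num_tokens, num_experts)."""
    return (router_probs * expert_output_norms).mean(dim=0)

def prune_experts(scores, keep_ratio=0.5):
    """Keep the top `keep_ratio` fraction of experts by score
    (the article reports roughly half of the experts retained on DeepSeek)."""
    k = max(1, int(scores.numel() * keep_ratio))
    return torch.topk(scores, k).indices

# toy usage: 6 tokens from domain demos, 8 experts
probs = torch.softmax(torch.randn(6, 8), dim=-1)
norms = torch.rand(6, 8)
print(prune_experts(score_experts(probs, norms)))  # indices of experts to keep
```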
Towards Thinking-Optimal Scaling of Test-Time Compute for LLM Reasoning
Authors: Yang Wenkai, Ma Shuming, Lin Yankai, Wei Furu
The authors find that excessively long reasoning chains can hurt performance. They propose a strategy that selects the shortest correct answer from multiple sampled reasoning paths, achieving better efficiency.

Future Link Prediction Without Memory or Aggregation
Authors: Yi Lu, Lei Runlin, Mo Fengran, Zheng Yanping, Wei Zhewei, Ye Yuhang
CRAFT, a novel cross-attention-based method, avoids both memory and aggregation modules. It achieves strong performance on real-world temporal graph datasets.

MoORE: SVD-based Model MoE-ization for Conflict- and Oblivion-Resistant Multi-Task Adaptation
Authors: Yuan Shen, Zheng Yin, Wang Taifeng, Liu Binbin, Xu Hongteng
MoORE uses SVD decomposition and orthogonal experts to enable conflict-free and forgetting-resistant multi-task adaptation. It outperforms existing methods on multiple benchmarks.

MemSim: A Bayesian Simulator for Evaluating Memory of LLM-based Personal Assistants
Authors: Zhang Zeyu, Dai Quanyu, Chen Luyu, Jiang Zeren, Li Rui, Zhu Jiemin, Chen Xu, Xie Yi, Dong Zhenhua, Wen Jirong
MemSim uses Bayesian networks and causal generation to create reliable, diverse evaluation data for LLM memory systems. It also introduces MemDaily, a new benchmark for memory evaluation.

Scaling Diffusion Transformers Efficiently via μP
Authors: Zheng Chenyu, Zhang Xinyu, Wang Rongzhen, Huang Wei, Tian Zhi, Huang Weilin, Zhu Jun, Li Chongxuan
The paper extends μP (maximal update parametrization) to diffusion Transformers, enabling efficient scaling. It achieves 2.9× faster convergence and superior performance with drastically reduced tuning cost.

GUI-G1: Understanding R1-Zero-Like Training for Visual Grounding in GUI Agents
Authors: Zhou Yuqi, Dai Sunhao, Wang Shuai, Zhou Kaiwen, Jia Qinglin, Xu Jun
The paper identifies key bottlenecks in R1-Zero-style training for GUI grounding and proposes solutions including a fast-thinking template, a size-constrained reward, and difficulty-weighted optimization, achieving high accuracy with minimal data.

Stability and Sharper Risk Bounds with Convergence Rate $1/n^2$
Authors: Zhu Bowei, Li Shaojie, Yi Mingyang, Liu Yong
The paper proves a tighter generalization bound of $O(\log^2(n)/n^2)$ under standard assumptions, providing the tightest high-probability bound for non-convex gradient methods.

Counterfactual Reasoning for Steerable Pluralistic Value Alignment in Large Language Models
Authors: Guo Hanzhe, Yao Jing, Zhou Xiao, Yi Xiaoyuan, Xie Xing
COUPLE uses structural causal models and counterfactual reasoning to enable fine-grained, controllable value alignment across diverse cultural contexts.

MOF-BFN: Metal-Organic Frameworks Structure Prediction via Bayesian Flow Networks
Authors: Jiao Rui, Wu Hanlin, Huang Wenbing, Song Yuxuan, Ouyang Yawen, Rong Yu, Xu Tingyang, Wang Pengju, Zhou Hao, Ma Weiying, Liu Jingjing, Liu Yang
MOF-BFN uses Bayesian flow networks to predict metal-organic framework (MOF) structures with high accuracy, incorporating periodicity and orientation modeling.

Learning 3D Anisotropic Noise Distributions Improves Molecular Force Fields
Authors: Liu Xixian, Jiao Rui, Liu Zhiyuan, Liu Yurou, Liu Yang, Lu Ziheng, Huang Wenbing, Zhang Yang, Cao Yixin
AniDS introduces anisotropic noise modeling with a structure-aware covariance, significantly improving force prediction accuracy (see the brief sampling sketch below).
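For the anisotropic-noise idea in the AniDS summary above, the following sketch draws per-atom noise from a structure-aware covariance via a Cholesky factor. How the covariance is predicted from molecular structure is the paper's contribution and is not modeled here; shapes and names are assumptions.

```python
import torch

def sample_anisotropic_noise(cov):
    """Draw per-atom noise eps ~ N(0, cov) from a given covariance matrix.
    cov: (num_atoms, 3, 3), symmetric positive-definite."""
    L = torch.linalg.cholesky(cov)        # (N, 3, 3) Cholesky factors
    z = torch.randn(cov.shape[0], 3, 1)   # isotropic base noise
    return (L @ z).squeeze(-1)            # (N, 3) anisotropic noise

# toy usage: 4 atoms with random SPD covariances
A = torch.randn(4, 3, 3)
cov = A @ A.transpose(-1, -2) + 0.1 * torch.eye(3)
print(sample_anisotropic_noise(cov).shape)  # torch.Size([4, 3])
```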
MF-LLM: Simulating Population Decision Dynamics via a Mean-Field Large Language Model Framework
Authors: Mi Qirui, Yang Mengyue, Yu Xiangning, Zhao Ziyu, Deng Cheng, An Bo, Zhang Haifeng, Chen Xu, Wang Jun
MF-LLM applies mean-field theory to LLM-based social simulation, enabling bidirectional individual-group interaction and improved alignment with real-world data.

Irrational Complex Rotations Empower Low-bit Optimizers
Authors: Tian Zhen, Zhao Xin, Wen Jirong
π-Quant uses irrational complex rotations to compress optimizer states to about 3.32 bits, reducing memory by roughly 40% while maintaining or improving performance.

A Generalized Iterative Imputation Framework for Model Adaptation and Oracle Feature Utilization
Authors: Wang Hao, Li Zhengnan, Chen Zhichao, Chen Xu, He Shuting, Liu Guangyi, Li Haoxuan, Lin Zhouchen
KPI introduces a two-layer framework combining adaptive modeling with oracle feature utilization, outperforming existing imputation methods.

TransDF: Time-Series Forecasting Needs Transformed Label Alignment
Authors: Wang Hao, Pan Licheng, Chen Zhichao, Chen Xu, Dai Qingyang, Wang Lei, Li Haoxuan, Lin Zhouchen
TransDF transforms the labels to reduce autocorrelation and the effective number of tasks, achieving state-of-the-art forecasting performance.

Latent Retrieval Augmented Generation of Cross-Domain Protein Binders
Authors: Zhang Zishen, Kong Xiangzhe, Huang Wenbing, Liu Yang
RADiAnce uses retrieval-augmented diffusion to generate protein binders with strong cross-domain transferability and improved interface accuracy.

SSTAG: Structure-Aware Self-Supervised Learning Method for Text-Attributed Graphs
Authors: Liu Ruyue, Yin Rong, Bo Xiangzhen, Hao Xiaoshuai, Liu Yong, Zhong Jinwen, Ma Can, Wang Weiping
SSTAG combines LLM semantics with GNN structure via dual knowledge distillation and a memory store, enabling scalable and generalizable graph learning.

HawkBench: Investigating Resilience of RAG Methods on Stratified Information-Seeking Tasks
Authors: Qian Hongjin, Liu Zheng, Gao Chao, Wang Yankai, Lian Defu, Dou Zhicheng
HawkBench is a multi-domain, multi-task benchmark with stratified evaluation, revealing critical gaps in the resilience and adaptability of RAG methods.

Chain-of-Retrieval Augmented Generation
Authors: Wang Liang, Chen Haonan, Yang Nan, Huang Xiaolong, Dou Zhicheng, Wei Furu
CoRAG enables dynamic, multi-step retrieval during reasoning. Using rejection sampling to generate intermediate retrieval chains, it achieves more than a 10% EM improvement on multi-hop QA (a simplified retrieval-loop sketch appears after this group of summaries).

ICPC-Eval: Probing the Frontiers of LLM Reasoning with Competitive Programming Contests
Authors: Xu Shiyi, Hu Yiweng, Min Yingqian, Chen Zhipeng, Zhao Xin, Wen Jirong
ICPC-Eval is a new benchmark built from real ICPC competition problems, featuring realistic difficulty, robust test-case generation, and the Refine@K metric for assessing iterative reasoning.

MomentSeeker: A Task-Oriented Benchmark For Long-Video Moment Retrieval
Authors: Yuan Huaying, Ni Jian, Liu Zheng, Wang Yuezhe, Zhou Junjie, Liang Zhengyang, Zhao Bo, Cao Chao, Dou Zhicheng, Wen Jirong
MomentSeeker is a large-scale, multi-domain benchmark for long-video moment retrieval with hierarchical evaluation and multi-modal queries.
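The multi-step retrieval described in the Chain-of-Retrieval Augmented Generation summary above can be sketched as a simple retrieve-then-decide loop. The callables below (generate_subquery, retrieve, answer) are hypothetical stand-ins for an LLM and a retriever, not CoRAG's actual interface, and the rejection-sampling training procedure mentioned in the summary is not shown.

```python
def chain_of_retrieval(question, generate_subquery, retrieve, answer, max_steps=4):
    """Alternate between issuing a sub-query and retrieving evidence until the
    model decides it has enough context, then produce the final answer."""
    context = []
    for _ in range(max_steps):
        subquery = generate_subquery(question, context)  # None means "ready to answer"
        if subquery is None:
            break
        context.extend(retrieve(subquery))               # accumulate retrieved passages
    return answer(question, context)

# toy usage with trivial stand-ins
docs = {"capital of France": ["Paris is the capital of France."]}
result = chain_of_retrieval(
    "What is the capital of France?",
    generate_subquery=lambda q, ctx: None if ctx else "capital of France",
    retrieve=lambda sq: docs.get(sq, []),
    answer=lambda q, ctx: ctx[0] if ctx else "unknown",
)
print(result)
```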
These 39 papers highlight the strong research momentum and technical depth of the Gaoling School of Artificial Intelligence at Renmin University of China, establishing its leadership in AI innovation across multiple cutting-edge domains.

Related Links

News announcement from the Gaoling School of Artificial Intelligence, Renmin University of China