Shanghai Jiao Tong University AI Breakthroughs Featured at Leading Conferences in 2025
Recent breakthroughs in artificial intelligence research from multiple teams at the Institute of Natural Sciences, Shanghai Jiao Tong University, have been recognized by top-tier international conferences and journals in 2025. The achievements span foundational theory, algorithmic innovation, and practical applications, showcasing the institute’s strong academic leadership and research vitality. Over 20 papers have been accepted or published in leading venues, including the Journal of Machine Learning Research (JMLR), the International Conference on Machine Learning (ICML), Neural Information Processing Systems (NeurIPS), the International Conference on Learning Representations (ICLR), the Association for Computational Linguistics (ACL), and Neural Networks. The research portfolio covers key areas from theoretical foundations to real-world applications, reflecting the institute’s comprehensive and integrated approach to AI development. Notable results include 1 paper in JMLR, 4 in ICML (including 1 Spotlight), 7 in NeurIPS (including 1 Oral and 1 Spotlight), 1 in ICLR, 1 in ACL, and 1 in Neural Networks.

I. Foundational AI Theory (by lead researcher's surname)

Luo Tao’s Team

Luo’s team developed a gradient flow framework to analyze the training dynamics of linearized Transformers, revealing a two-phase behavior: in the first phase, asymmetric perturbations from random initialization maintain non-degenerate gradients and drive parameter alignment; in the second phase, previously static K–Q attention matrices become dominant, inducing asymptotic rank collapse. This work establishes a theoretical link between directional convergence and rank collapse in Transformers. The paper was accepted as an Oral presentation at NeurIPS 2025, with Chen Zheng’an as first author and Luo Tao as corresponding author.

In another theoretical advance, Zhang Yaoyu and Luo Tao demonstrated the KKT embedding principle: small networks’ maximum-margin solutions can be embedded into larger networks’ KKT points through neuron splitting. The result holds for both two-layer and deep networks and extends to gradient flow dynamics, offering new insights into the convergence behavior of neural networks. This work was accepted as a Poster at NeurIPS 2025, with Zhang Jiahao as first author and Zhang Yaoyu and Luo Tao as corresponding authors.

Additionally, in the AI for Mathematics (AI4Math) domain, Luo’s team introduced ATLAS, a data generation framework that addresses the scarcity of parallel corpora in formal mathematics. ATLAS uses three stages (data boosting, synthesis, and augmentation) to generate a large-scale, high-quality dataset of 117,000 theorem statements from the Mathlib knowledge base. The ATLAS translator trained on this data achieved state-of-the-art performance across benchmarks. The work was accepted as a Poster at NeurIPS 2025, with Liu Xiaoyang as first author and Luo Tao as corresponding author.

Min Hancheng’s Team

Min’s team made significant progress in robust learning under isotropic Gaussian mixture models. They proved that classifiers with provable adversarial robustness can be trained without additional defense mechanisms. The team derived a theoretical upper bound on the l2-norm attack that any classifier can withstand while maintaining high accuracy, and showed that a polynomial ReLU network trained via gradient flow can approximate the optimal robust classifier without relying on adversarial training data. This work was published in ICML 2025, providing a new theoretical foundation for understanding and improving adversarial robustness.
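To make this setting concrete, the short sketch below evaluates the textbook linear rule that classifies by the sign of the inner product between x and the mean vector mu, on the symmetric isotropic Gaussian mixture where x is drawn from N(y*mu, sigma^2 I) with label y in {-1, +1}. It measures accuracy under a worst-case l2 perturbation of budget eps and compares it with the closed-form value Phi((||mu|| - eps)/sigma), where Phi is the standard normal CDF; this closed form holds for the linear rule, not for the team's polynomial ReLU analysis, and the dimension, sample size, and attack budget are arbitrary choices made for the example.

    # Illustrative sketch only: the simple linear rule sign(<mu, x>) on the
    # symmetric isotropic Gaussian mixture x ~ N(y*mu, sigma^2 I), y in {-1, +1}.
    # The published analysis concerns polynomial ReLU networks trained by
    # gradient flow; this snippet merely shows how robust accuracy under a
    # worst-case l2 attack of budget eps is measured, and compares it with the
    # closed form Phi((||mu|| - eps) / sigma) that holds for this linear baseline.
    import numpy as np
    from math import erf, sqrt

    rng = np.random.default_rng(0)
    d, n, sigma, eps = 50, 200_000, 1.0, 0.8      # hypothetical problem sizes
    mu = np.full(d, 1.0 / sqrt(d))                # chosen so that ||mu|| = 1

    y = rng.choice([-1.0, 1.0], size=n)
    x = y[:, None] * mu + sigma * rng.standard_normal((n, d))

    # For the linear classifier w = mu, the worst-case perturbation with
    # ||delta||_2 <= eps reduces the margin y*<mu, x> by exactly eps*||mu||.
    margin = y * (x @ mu)
    robust_acc = np.mean(margin > eps * np.linalg.norm(mu))

    std_normal_cdf = lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0)))
    theory = std_normal_cdf((np.linalg.norm(mu) - eps) / sigma)
    print(f"empirical robust accuracy: {robust_acc:.4f}  closed form: {theory:.4f}")

As eps approaches ||mu||, the closed form drops to chance level, which gives an intuitive picture of why there is a hard ceiling on the l2 budget any classifier can tolerate in this kind of mixture model.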
In the study of neural collapse, Min’s team rigorously proved that gradient flow optimization in single-hidden-layer ReLU networks naturally leads to neural collapse under specific classification tasks. This breaks from prior analyses that assume an unconstrained feature space, revealing the critical roles of data structure and nonlinear activation in shaping collapse dynamics. The findings highlight the implicit bias in the training dynamics that drives neural collapse. The paper was accepted at NeurIPS 2025 and offers deeper theoretical insight into deep network training.

Xu Zhiqin’s Team

Xu’s team investigated the impact of initialization on the reasoning capabilities of Transformers using anchor functions. They found that under small initialization, the model’s parameter space favors reasoning-related structures, and embedding representations become better at capturing structural patterns in reasoning data, leading to faster fitting. These findings were validated in large language models and contribute substantially to the understanding of reasoning mechanisms. The work was accepted at ICML 2025 and selected as a Spotlight paper, with Yao Junjie as first author and Xu Zhiqin and Zhang Zhongwang as co-corresponding authors.

The team further extended this analysis to compare Mamba and Transformer models, uncovering a fundamental difference: Mamba struggles with symmetric patterns. Experiments identified the structural causes behind this limitation, offering valuable guidance for future model design. The paper was accepted as a Spotlight at NeurIPS 2025, with Chen Tianyi and Lin Pengxiao as co-first authors and Xu Zhiqin as corresponding author. The area chair praised the work for its simple yet effective synthetic tasks and for practical improvements such as residual paths and gating mechanisms.

Zhang Yaoyu’s Team

Zhang’s team proposed a novel method called “optimistic estimation” to determine the minimum sample size required for model recovery in general settings, based on local linear recovery. They proved that the optimistic sample complexity is bounded by the parameter count of the smallest network capable of representing the target function. The study also showed that increasing network width improves sample efficiency, while redundant connections reduce it. The team introduced the concept of “model rank” to quantify parameter convergence and generalized it to nonlinear parameterized models. This work was published in JMLR, with Zhang Yaoyu as first and corresponding author.

II. AI Algorithms (by lead researcher's surname)

Wang Yuguang’s Team

The team developed a novel hypergraph message-passing framework inspired by interacting particle systems. Hyperedges are modeled as “fields” that induce node dynamics, incorporating attractive and repulsive forces along with Allen-Cahn forcing terms to achieve class-dependent balance. By using first- and second-order particle system equations, the method effectively mitigates over-smoothing and handles heterophily, while stochastic components capture interaction uncertainty. Theoretical analysis guarantees a positive lower bound on the hypergraph Dirichlet energy, enabling deeper message passing. The model demonstrated strong performance across multiple real-world datasets. The work was published in NeurIPS 2025, offering a new paradigm for hypergraph learning and complex relational modeling.
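To give a concrete feel for this particle-system view, the sketch below performs a first-order update in which each hyperedge acts as a field that pulls its member nodes toward the hyperedge mean, while an Allen-Cahn forcing term u - u^3 keeps features away from the constant, over-smoothed state. It is a minimal illustration under assumed function names, step sizes, and coefficients, not the published model, and it omits the repulsive forces and stochastic components of the full framework.

    # A minimal sketch (assumptions, not the published model): each hyperedge acts
    # as a "field" that attracts its member nodes toward the hyperedge mean, while
    # an Allen-Cahn forcing term f(u) = u - u**3 pushes features away from the
    # constant, over-smoothed state. Names and coefficients are hypothetical.
    import numpy as np

    def hypergraph_step(X, hyperedges, dt=0.1, attract=1.0, allen_cahn=0.5):
        """One explicit Euler step of first-order particle dynamics.

        X: (num_nodes, dim) node features; hyperedges: list of node-index lists.
        """
        force = np.zeros_like(X)
        for e in hyperedges:
            field = X[e].mean(axis=0)              # the hyperedge's mean field
            force[e] += attract * (field - X[e])   # attraction toward the field
        force += allen_cahn * (X - X**3)           # Allen-Cahn double-well forcing
        return X + dt * force

    # Toy usage: 5 nodes with 3-dimensional features and 2 overlapping hyperedges.
    X = np.random.default_rng(1).standard_normal((5, 3))
    for _ in range(10):
        X = hypergraph_step(X, hyperedges=[[0, 1, 2], [2, 3, 4]])
    print(X.round(3))

The double-well term is one simple way to see why such dynamics can keep the Dirichlet energy bounded away from zero: features are discouraged from collapsing to a single constant value even after many message-passing steps.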
Wu Hao’s Team

Addressing the inconsistency between the sample distribution and the energy function in diffusion models for molecular equilibrium, Wu’s team proposed a Fokker–Planck regularized diffusion model. This approach significantly improves the alignment between generated samples and the underlying energy landscape. Based on this framework, they developed a transferable Boltzmann generator for small molecules, enabling efficient sampling of coarse-grained systems and the simultaneous construction of precise energy models. The work was published in NeurIPS 2025, with Wu Hao as co-corresponding author.

Zhang Xiaoqun’s Team

The team tackled low-multilinear-rank tensor completion by introducing a novel preconditioned Riemannian metric based on the structure of the fixed-rank tensor manifold. They designed an efficient preconditioned Riemannian gradient descent algorithm that converges faster than existing methods while maintaining the same per-iteration computational cost. Theoretical analysis establishes recovery guarantees under near-optimal sampling complexity, and experiments on synthetic data and real-world video restoration tasks demonstrate superior performance. The work was published in ICML 2025, providing a powerful tool for high-dimensional data analysis.

III. AI Applications

Zhou Bingxin’s Team

The team introduced VenusVaccine, a deep learning-based tool for precise immunogenicity prediction. By integrating dual attention mechanisms to capture both protein sequence and structural information, and leveraging the most comprehensive immunogenicity dataset to date (over 7,000 antigens from bacteria, viruses, and tumors), VenusVaccine outperformed existing methods across multiple metrics. It also demonstrated practical value through post-hoc validation in real vaccine design. The work was published in ICLR 2025.

Additionally, the team launched VenusFactory, an open, unified platform for protein engineering with both GUI and command-line interfaces. It enables no-code workflows for mutation prediction, functional analysis, residue-level inference, data retrieval, model training, evaluation, and deployment. The platform has achieved over 100,000 monthly downloads of models and datasets. This work was published in ACL 2025.

IV. Editorial Leadership

Notably, Associate Professor Liang Jingwei served as an Area Chair for both ICML 2025 and NeurIPS 2025, his second consecutive year in these roles, while Associate Professor Min Hancheng will serve as an Area Chair for ICLR 2026 and AISTATS 2026. Associate Professor Liu Lin previously served as an Area Chair for CLeaR 2023 and 2024, and Associate Professor Wang Yuguang co-chaired the 2024 Learning on Graphs (LoG) Conference. These leadership roles reflect the institute’s growing influence in shaping the global AI research agenda.

The sustained output of high-impact research underscores Shanghai Jiao Tong University’s commitment to advancing AI from theory to real-world innovation. The institute continues to drive frontier interdisciplinary research, aiming to deliver lasting contributions to the global AI community.
For collaboration and further information, visit the research group websites:

Liang Jingwei: https://jliang993.github.io/
Liu Lin: https://linliu-stats.github.io/
Luo Tao: https://math.sjtu.edu.cn/Default/teachershow/tags/MDAwMDAwMDAwMLKIet0%E3%80%91
Min Hancheng: https://hanchmin.github.io/
Wang Yuguang: https://yuguangwang.github.io/
Wu Hao: https://ins.sjtu.edu.cn/peoples/wuhao
Xu Zhiqin: https://ins.sjtu.edu.cn/people/xuzhiqin/
Zhang Xiaoqun: https://math.sjtu.edu.cn/faculty/xqzhang/
Zhang Yaoyu: https://yaoyuzhang1.github.io/
Zhou Bingxin: https://ins.sjtu.edu.cn/peoples/ZhouBingxin