New research from the AI4S team led by Tian Yonghong and Chen Jie at the School of Information Engineering, Shenzhen Graduate School, has been published in Nature Machine Intelligence.

### Abstract

In a significant advance for Artificial Intelligence for Science (AI4S), a research team from the School of Information Engineering at Shenzhen Graduate School, led by Professors Tian Yonghong and Chen Jie, has published its latest findings in *Nature Machine Intelligence*. The work arrives as AI's impact on scientific research and human cognition has been highlighted by the 2024 Nobel Prizes in Physics and Chemistry, both awarded for AI-related research. The team's earlier work has already earned notable recognition, including a finalist position for the 2022 Gordon Bell Special Prize and several prestigious awards in 2023 and 2024, which underscores China's leading role in computational clusters and scientific innovation.

The research centers on a customized protein language model tailored for evolutionary prediction tasks. The team introduced a novel pre-training strategy and dataset, offering fresh insight into the balance between pre-training and downstream tasks in protein language models. By addressing two fundamental aspects of viral evolution, "few-site mutations" and "rare beneficial mutations", the study proposes a universal prediction model that can be applied across virus types and strains, including SARS-CoV-2, influenza, Zika, and HIV. The model, named E2VD (Evolution-driven Virus Variant Dynamics), integrates evolutionary theory with advanced AI architectures to predict viral mutations accurately and efficiently.

E2VD's framework is built on three key components: a customized large language model for proteins, a comprehensive module for reconstructing the interaction network of mutations, and a multi-task focal loss function. The large language model, trained on 256 NPU cards of the domestically built exascale ("E-level") intelligent computing platform "Pengcheng Cloud Brain II", is designed to capture the subtle changes in molecular interactions caused by "few-site mutations."
The interaction network module, which includes a dynamic granularity attention mechanism to identify motif patterns, addresses the challenge of predicting "rare beneficial mutations" despite the severe imbalance between positive and negative samples. The multi-task focal loss function further enhances the model's performance by focusing on the most critical tasks.

The results show that E2VD outperforms existing methods by 7% to 21% in predicting key viral evolutionary drivers. Notably, accuracy in predicting rare beneficial mutations rose from 13% to 80%. E2VD's ability to predict evolutionary trends at different scales, from within a single pandemic to broader macro-evolutionary trajectories, provides a robust theoretical foundation for understanding viral evolution in the real world.

The research also highlights E2VD's strong generalization across virus types and strains. The team developed a robust metric to assess the changes in viral fitness caused by mutations, so that the model's performance can be compared consistently across viral species. The metric was validated on SARS-CoV-2, Zika, influenza, and HIV, demonstrating E2VD's potential for broader application in the study of infectious diseases.

The implications are far-reaching. E2VD can be integrated into vaccine and protein-based drug design, potentially making these processes more efficient and controllable. By providing a tool for the rapid and proactive updating of vaccines and drugs, the model could significantly improve human responses to emerging viral infections. The study also supports and accelerates the exploration of complex evolutionary mechanisms, contributing to a deeper understanding of how species diversity and protein function are interconnected.
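
To make the multi-task focal loss among E2VD's components more concrete, the following is a minimal sketch of the general idea, not the authors' implementation. It uses the standard binary focal loss, `-alpha_t * (1 - p_t)**gamma * log(p_t)`, which down-weights easy examples so that rare positive samples contribute more to training. The function names, the default `gamma` and `alpha` values, and the weighted-sum combination across tasks are all assumptions made for illustration; the article does not specify how E2VD combines its tasks.

```python
import math


def binary_focal_loss(probs, labels, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over predicted probabilities and 0/1 labels.

    gamma controls how strongly easy examples are down-weighted;
    alpha balances positive against negative samples.
    Both defaults are illustrative, not values taken from the article.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)             # avoid log(0)
        p_t = p if y == 1 else 1.0 - p              # probability of the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


def multi_task_focal_loss(task_probs, task_labels, task_weights,
                          gamma=2.0, alpha=0.25):
    """Weighted sum of per-task focal losses.

    The weighted-sum combination is a hypothetical choice for this sketch.
    """
    return sum(
        w * binary_focal_loss(p, y, gamma, alpha)
        for p, y, w in zip(task_probs, task_labels, task_weights)
    )
```

With `gamma=0` and `alpha=0.5` the focal term reduces to half the ordinary cross-entropy, which gives a quick sanity check; larger `gamma` values shrink the contribution of confidently correct predictions, which is the property that helps when beneficial mutations are a small minority of the samples.
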

The paper's lead authors are doctoral student Nie Zhiwei and master's student Liu Xudong from the School of Information Engineering at Peking University Shenzhen Graduate School, with Professors Tian Yonghong and Chen Jie serving as corresponding authors. The collaboration between Shenzhen Graduate School and the Guangzhou National Laboratory, represented by researcher Zhou Peng, exemplifies the interdisciplinary nature of AI4S research and its potential to change scientific methodology.

### Key Points:

- **Publication and Recognition**: The research by Professors Tian Yonghong and Chen Jie's team from Shenzhen Graduate School is published in *Nature Machine Intelligence*. The team's work has received multiple honors, including a finalist position for the 2022 Gordon Bell Special Prize and the 2023 Guangdong Science and Technology Award.
- **Research Focus**: The team developed a customized protein language model, E2VD, for evolutionary prediction, addressing "few-site mutations" and "rare beneficial mutations" in viral evolution.
- **Model Components**: E2VD integrates a large language model, an interaction network module with dynamic granularity attention, and a multi-task focal loss function.
- **Performance**: E2VD outperforms existing methods by 7% to 21% in predicting viral evolutionary drivers and raises the accuracy of rare beneficial mutation prediction from 13% to 80%.
- **Generalization**: The model generalizes well across virus types and strains, validated on SARS-CoV-2, Zika, influenza, and HIV.
- **Implications**: E2VD can be used to enhance vaccine and drug design, improve responses to emerging viral infections, and deepen the understanding of complex evolutionary mechanisms.

### Conclusion

This research by the Shenzhen Graduate School team marks a significant step forward in AI4S, offering a powerful tool for predicting viral evolution and supporting the development of more effective medical interventions. The interdisciplinary approach and the model's robust performance highlight the potential of AI to transform scientific methodologies and advance our understanding of biological systems.
