【ScienceAI Weekly】AI Drug Discovery Company Spun Off from DeepMind Signs New Agreements Worth About $3 Billion; ByteDance Is Reportedly Recruiting Biology, Chemistry, and Physics Talent in the United States

New achievements, new developments, and new perspectives in AI for Science:
* AI drug discovery company spun off from DeepMind signs its first pharmaceutical partnerships, worth about $3 billion
* Microsoft helps researchers screen 32 million potential battery materials
* TikTok's parent company is reportedly recruiting talent in computational biology, quantum chemistry, molecular dynamics, and physics across the United States
* iFlytek plans to spin off its medical business and list it on the main board of the Hong Kong Stock Exchange
* Magnesium Chemical completes $26 million Series A financing
* Academic journal Science uses an AI tool to detect manipulated images in papers
See below for details~
Company News
AI drug discovery company spun off from DeepMind signs its first pharmaceutical partnerships
On January 8, Alphabet's AI drug discovery company Isomorphic Labs announced two drug development agreements, with Eli Lilly and Novartis, worth a combined total of about $3 billion. The partnerships cover the development of treatments targeting multiple disease-related proteins and pathways. Isomorphic Labs was founded in November 2021 as a spin-off from Google's DeepMind and focuses on AI-driven drug discovery. The company builds on DeepMind's work in biomedical research, especially its protein structure prediction model AlphaFold, to conduct drug research and development.
Microsoft helps researchers screen 32 million potential battery materials
Pacific Northwest National Laboratory (PNNL), a U.S. Department of Energy laboratory, reportedly used Microsoft's Azure Quantum Elements service to rapidly evaluate 32 million potential new battery materials. After 80 hours with the service, the researchers had identified 18 promising candidates, work that might have taken decades with traditional research methods. British chemical manufacturer Johnson Matthey is also using the service to accelerate research and development of hydrogen fuel cells.
TikTok's parent company is reportedly recruiting talent in computational biology, quantum chemistry, molecular dynamics, and physics across the United States
TikTok's parent company ByteDance is reportedly recruiting talent in computational biology, quantum chemistry, molecular dynamics, and physics for its AI drug design and AI for science teams. According to the reports, ByteDance is hiring for at least 17 related positions in New York, California, and Washington.
iFlytek plans to spin off its medical business and list it on the main board of the Hong Kong Stock Exchange
On the evening of January 9, iFlytek announced that it plans to spin off its majority-owned subsidiary iFlytek Medical and list it on the main board of the Hong Kong Stock Exchange. The offering will not exceed 15% of iFlytek Medical's total share capital after the issuance. iFlytek will retain control of iFlytek Medical once the spin-off is complete.
iFlytek Medical was founded in May 2016. Building on world-leading core technologies such as medical semantic computing, text understanding, knowledge reasoning, and data mining, it has developed a system of AI medical solutions that serves primary care institutions, hospitals, patients, and residents. The system covers the entire medical process, from disease warning, early screening, diagnosis, treatment, and efficacy evaluation to post-diagnosis and chronic disease management.
Magnesium Chemical completes $26 million Series A financing
Magnesium Chemical (Shanghai Meirui Technology Co., Ltd.) recently completed a US$26 million Series A financing round led by Qiming Venture Partners and LYFE Capital, with participation from Sinovation Ventures and Mega Technology. The funds will be used to advance product research and development, expand the company's commercial market, and support its international expansion.
Magnesium Chemical was founded in January 2022. It was incubated by Mega Technology, which also provided its angel round investment. Founded by a team with an international, interdisciplinary background, the company uses automated and intelligent platforms to provide a new generation of chemical synthesis CRO services to new drug developers. Its aim is to significantly shorten the delivery cycle and reduce the cost of chemical synthesis in drug development, and to end the field's heavy reliance on manual operations.
Academic journal Science uses an AI tool to detect manipulated images in papers
Science has deployed the Proofig platform after trialing it for several months. The trial gave clear evidence that problematic data, such as images manipulated to mislead readers, can be detected before a paper is published. Science also uses the tool alongside text plagiarism detection software, replacing manual review.
Tools and Resources
Huawei and the University of Hong Kong open source geometric mathematical model G-LLaVA
Multimodal large language models still struggle to accurately parse the basic elements of geometric figures and the relationships between them. To address this, Huawei Noah's Ark Lab, the University of Hong Kong, and the Hong Kong University of Science and Technology jointly open-sourced G-LLaVA, a model specialized for geometry problems. To test G-LLaVA's performance, the researchers evaluated it against other large models on the well-known mathematical benchmark MathVista. The results show that G-LLaVA outperforms models such as GPT-4V, LLaVA-1.5, and MiniGPT-4.
Open source address:
https://github.com/pipilurj/G-LLaVA
Paper address:
https://arxiv.org/abs/2312.11370
Shanghai AI Laboratory open sources medical model group "Puyi 2.0"
Shanghai AI Laboratory, together with partners including Ruijin Hospital, affiliated with Shanghai Jiao Tong University School of Medicine, recently released the medical multimodal foundation model group "Puyi 2.0" (OpenMEDLab 2.0). The release open sources the medical large model group in one place for industry, academia, research, application, and evaluation, and aims to support AI medical applications that work across domains, diseases, and modalities.
Open source address:
https://github.com/OpenMEDLab
China's first medical specialty reasoning dataset RJUA-QA is now open source
Ant Group and the urology expert team at Shanghai Renji Hospital jointly developed RJUA-QA, the first Chinese question-answering and reasoning dataset for a medical specialty. It is built from the clinical experience of the doctor team and from constructed, simulated case data. The dataset is split into training, validation, and test sets and contains 2,132 QA pairs. The context is drawn from the Chinese Guidelines for the Diagnosis and Treatment of Urology and Men's Diseases. The disease types included account for more than 97.6% of urology patients, so the dataset closely reproduces real diagnosis and treatment scenarios.
Dataset address:
http://openkg.cn/dataset/rjua-qadatasets
paperai: medical/scientific literature discovery and review engine
paperai is an AI-driven literature discovery and review engine for medical and scientific papers. The tool runs queries to filter papers that meet specific criteria, and it generates reports that use extractive question answering to pull answers to key questions from a set of medical or scientific papers. paperai has been used to analyze the COVID-19 Open Research Dataset (CORD-19) and won multiple awards in the CORD-19 Kaggle Challenge.
Tool address:
https://github.com/neuml/paperai
DeepKE: Zhejiang University's open source Chinese knowledge graph extraction tool based on deep learning
DeepKE is an open source, extensible knowledge graph extraction tool. It supports standard fully supervised, low-resource (few-shot), and document-level scenarios, and covers information extraction tasks including named entity recognition, relation extraction, and attribute extraction. Through a unified framework, DeepKE lets developers and researchers customize datasets and models and extract information from unstructured text according to their needs.
Tool address:
http://openkg.cn/tool/deepke
ResGen: A pocket-aware 3D molecular generation model
A research team from Zhejiang University and Zhijiang Laboratory proposed ResGen, a 3D molecular generation model that is aware of protein pockets and is used to design organic molecules inside a given target. ResGen is reported to be about 8 times faster than the previous state of the art, with a higher success rate in generating new molecules than the previous best method.
Open source address:
https://github.com/HaotianZhangAI4Science/ResGen
Research Results
Generative AI generates chemical reaction transition states in 6 seconds
Accurate transition state generation with an object-aware equivariant elementary reaction diffusion model

* Source: Nature Computational Science
* Field: Chemical Science, Machine Learning
* Author: MIT team
The researchers developed a machine learning-based alternative to conventional computational methods that can find the transition states of chemical reactions in seconds. The new model can help chemists explore and design new reactions and catalysts that yield useful, high-value products such as fuel compounds or drugs. The model can also simulate naturally occurring chemical reactions.
Read the original article:
https://www.nature.com/articles/s43588-023-00563-7
Fast classification model for retired batteries based on federated learning
Collaborative and privacy-preserving retired battery sorting for profitable direct recycling via federated machine learning

* Source: Nature Communications
* Field: Battery Recycling, Machine Learning
* Author: Zhang Xuan and Zhou Guangmin from Tsinghua University Shenzhen International Graduate School
The research team built a federated learning-based model for rapidly sorting retired batteries. It accurately classifies the cathode (positive electrode) materials of retired batteries from only a small amount of field test data, without requiring historical operating data.
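To illustrate the general idea of federated learning mentioned above, here is a minimal sketch of federated averaging (FedAvg), in which participants share only model parameters and a server averages them. This is a generic textbook scheme, not necessarily the aggregation method used in the paper; the function name, the parameter layout, and the two "recycler" clients are hypothetical.

```python
from typing import Dict, List

# Assumption: each client's model is a dict mapping a layer name to a flat list of floats.
Weights = Dict[str, List[float]]


def federated_average(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Average client parameters, weighting each client by its local sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need exactly one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    averaged: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Two hypothetical recyclers send parameters only; raw battery data stays local.
recycler_a = {"layer": [1.0, 2.0]}  # trained on 1 local sample
recycler_b = {"layer": [3.0, 6.0]}  # trained on 3 local samples
global_weights = federated_average([recycler_a, recycler_b], [1, 3])
print(global_weights)
```

In a real deployment, the server would send the averaged parameters back to the clients and the cycle would repeat over many training rounds.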
Read the original article:
https://doi.org/10.1038/s41467-023-43883-y
UniKP: a unified framework for predicting enzyme kinetic parameters
UniKP: a unified framework for the prediction of enzyme kinetic parameters

* Source: Nature Communications
* Field: Biotechnology, Language Modeling
* Author: Chinese Academy of Sciences team
The researchers developed an enzyme kinetic parameter prediction framework (UniKP) based on a pre-trained large language model and a machine learning model. This framework can predict a variety of different enzyme kinetic parameters using only the amino acid sequence of a given enzyme and the structural information of its substrate.
Read the original article:
https://www.nature.com/articles/s41467-023-44113-1
DeepProSite: Identifying protein binding sites
DeepProSite: structure-aware protein binding site prediction using ESMFold and pretrained language model

* Source: Bioinformatics
* Field: Biomedicine, Language Model
* Author: Team from Shanghai Jiao Tong University and Sun Yat-sen University
DeepProSite uses both protein structure and sequence information to identify protein binding sites. It generates protein structures with ESMFold, obtains sequence representations from a pre-trained language model, and uses a Graph Transformer to frame binding site prediction as a graph node classification task.
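As a rough illustration of what "graph node classification" means here, the toy sketch below treats each residue as a node, mixes its features with those of its neighbours, and assigns each node a binary label. It deliberately swaps in simple mean pooling and a logistic scorer in place of a Graph Transformer, and every feature, edge, and weight value is made up for the example; it is not DeepProSite's implementation.

```python
import math
from typing import Dict, List


def aggregate(features: Dict[int, List[float]],
              edges: Dict[int, List[int]]) -> Dict[int, List[float]]:
    """Mean-pool each node's features with those of its neighbours."""
    pooled = {}
    for node, feat in features.items():
        group = [feat] + [features[n] for n in edges.get(node, [])]
        pooled[node] = [sum(col) / len(group) for col in zip(*group)]
    return pooled


def classify(pooled: Dict[int, List[float]], weights: List[float],
             bias: float, threshold: float = 0.5) -> Dict[int, bool]:
    """Score each node with a logistic model and threshold the probability."""
    labels = {}
    for node, feat in pooled.items():
        score = sum(w * x for w, x in zip(weights, feat)) + bias
        prob = 1.0 / (1.0 + math.exp(-score))
        labels[node] = prob >= threshold
    return labels


# Hypothetical 3-residue protein: residues 0 and 1 are spatial neighbours.
features = {0: [1.0, 0.0], 1: [1.0, 1.0], 2: [0.0, 0.0]}
edges = {0: [1], 1: [0], 2: []}
pooled = aggregate(features, edges)
labels = classify(pooled, weights=[2.0, 2.0], bias=-2.0)  # made-up parameters
print(labels)  # True marks a predicted binding-site residue
```

The point of the framing is that the model outputs one label per residue rather than one label per protein, so neighbouring residues in the predicted structure can inform each other's predictions.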
Read the original article:
https://academic.oup.com/bioinformatics/article/39/12/btad718/7453375
Upcoming Events
ALCF training: Supercomputing fundamentals for AI-driven research

"Introduction to AI-Driven Science on Supercomputers" is a free online training series hosted by the Argonne Leadership Computing Facility (ALCF). Sessions combine lectures with hands-on practice. The schedule is:
* Week 1: Introduction to supercomputers
* Week 2: Introduction to Neural Networks
* Week 3: Further exploration of neural networks
* Week 4: Introduction to Large Language Models
* Week 5: Embedding and tokenization of large language models
* Week 6: Parallel training methods for AI
Registration link:
https://www.alcf.anl.gov/alcf-ai-science-training-series?ct=t(EVT-ALCFINTROTOAI_01092024)
That is everything "ScienceAI Weekly" has to share this week~
If you have recent research results or first-hand company news related to AI for Science, please leave us a message with your tip.