HyperAI

Tsinghua University Leads the Release of the Uni-MOF Model, Which Effectively Identifies 630,000 Three-dimensional Spatial Configurations and Predicts the Adsorption Capacity of MOFs

In industry, high-purity gases are widely used in fields such as semiconductor manufacturing, optical fiber production, scientific research, healthcare, environmental protection and energy. In the semiconductor industry, for example, high-purity gases are key raw materials for chip manufacturing and directly affect the performance and yield of integrated circuits.

The key challenge in preparing high-purity gas is gas separation. Common separation methods include the cryogenic method (based on distillation), the adsorption method (based on molecular polarity) and the membrane method (based on membrane filtration). Among adsorbents, metal-organic frameworks (MOFs) have shown great potential for gas adsorption, storage and separation because of their highly ordered pore structure and adjustable pore size. Some predict that MOFs may be as important to the 21st century as plastics were to the 20th century.

However, accurately predicting the adsorption capacity of MOFs still faces many challenges. To address this issue, the team of Professor Lu Diannan from the Department of Chemical Engineering at Tsinghua University, in collaboration with Professor Wu Jianzhong from the University of California, Riverside, and Researcher Gao Zhifeng from the Beijing Institute of Scientific and Intelligent Technology, recently published a new paper titled "A comprehensive transformer-based approach for high-accuracy gas adsorption predictions in metal-organic frameworks" in Nature Communications.

This study proposes Uni-MOF, a machine learning model that works from the three-dimensional structure of MOF materials to predict the adsorption performance of nanoporous materials for various gases under various working conditions. This is a major breakthrough in the application of machine learning technology in the field of materials science.

Research highlights:

* The Uni-MOF framework is a versatile solution for predicting the gas adsorption capacity of MOFs under different conditions

* Uni-MOF not only learns to recognize and restore the 3D structure of nanoporous materials through pre-training, but also takes into account operating conditions such as temperature, pressure and the gas molecule involved, which makes it suitable for both scientific research and practical applications

* By leveraging adsorption data of other gases, Uni-MOF accurately predicts the adsorption performance of unknown gases

Paper address:
https://www.nature.com/articles/s41467-024-46276-x 

Dataset: existing database + program generated data

In this study, the MOF/COF structures used for pre-training come mainly from two sources: currently available databases and structures generated with dedicated programs.

A large number of MOF/COF databases are currently available, including the computationally synthesized hMOFs database, the MOFs built with the topology-based crystal constructor (ToBaCCo), the experimental-level CoRE (Computation-Ready, Experimental) MOFs and CoRE COFs, and the CCDC (Cambridge Crystallographic Data Centre).

In addition, more than 168,000 MOF/COF structures are available in the online integrated database MOFXDB. Beyond the nanoporous materials in these libraries, the researchers also used the ToBaCCo 3.0 program to generate more than 306,773 MOF structures.

For the downstream task, gas adsorption by MOFs, the researchers collected data from online sources such as MOFXDB: a dataset of more than 2.4 million entries on hMOFs for five gases (CO2, N2, CH4, Kr, Xe) at 273/298 K and 0.01–10 Pa, and a dataset of more than 460,000 entries on CoRE MOFs for two gases (Ar, N2) at 77/87 K and 1–10⁵ Pa.

In addition, the researchers performed Grand Canonical Monte Carlo (GCMC) simulations using the RASPA software, generating an additional 99,000 gas adsorption data points. The simulations used 50,000 initialization cycles followed by a further 50,000 cycles for sampling the adsorption capacity. These data cover the range of 150–300 K and 1 Pa–3 bar for seven gas molecules (CH4, CO2, Ar, Kr, Xe, O2, He).

Model framework: pre-training + multi-task prediction fine-tuning

The Uni-MOF framework includes pre-training on three-dimensional nanoporous crystals and fine-tuning for multi-task prediction in downstream applications.

Schematic overview of the Uni-MOF framework

During the pre-training phase of the model, the researchers implemented two types of tasks to improve model performance.

The first task is masked atom prediction: the model identifies and predicts the types of atoms in the masked part of the structure. The second task is three-dimensional coordinate recovery under noise: uniform noise in the range [-1 Å, +1 Å] is added to 15% of the atomic coordinates, and the spatial position encoding is then calculated from these corrupted coordinates.

These two types of tasks are designed to enhance the model's ability to resist data interference, thereby providing more accurate performance when facing subsequent prediction tasks.
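The corruption step behind these two tasks can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `corrupt_structure`, the `[MASK]` placeholder, and the choice to mask and perturb the same 15% subset of atoms are all assumptions made for illustration. Only the 15% ratio and the uniform noise range of [-1 Å, +1 Å] come from the description above.

```python
import random

MASK_TOKEN = "[MASK]"  # hypothetical placeholder for a hidden atom type


def corrupt_structure(atoms, coords, ratio=0.15, noise=1.0, seed=0):
    """Mask atom types and perturb coordinates for a random subset of atoms.

    atoms  : list of element symbols, e.g. ["Zn", "O", "C"]
    coords : list of (x, y, z) tuples in angstroms
    Returns the corrupted atoms, the corrupted coordinates, and the
    sorted indices of the atoms that were altered.
    Assumption: the same subset is used for masking and for noise.
    """
    rng = random.Random(seed)
    n = len(atoms)
    k = max(1, round(ratio * n))           # number of atoms to corrupt
    picked = sorted(rng.sample(range(n), k))
    picked_set = set(picked)

    # Task 1 input: hide the element type of the selected atoms.
    masked_atoms = [
        MASK_TOKEN if i in picked_set else a for i, a in enumerate(atoms)
    ]
    # Task 2 input: add uniform noise in [-noise, +noise] to each axis.
    noisy_coords = [
        tuple(c + rng.uniform(-noise, noise) for c in xyz)
        if i in picked_set
        else tuple(xyz)
        for i, xyz in enumerate(coords)
    ]
    return masked_atoms, noisy_coords, picked


if __name__ == "__main__":
    atoms = ["Zn", "O", "C", "H"] * 5
    coords = [(float(i), 0.0, 0.0) for i in range(len(atoms))]
    masked, noisy, idx = corrupt_structure(atoms, coords)
    print(idx, [masked[i] for i in idx])
```

During pre-training, a model would receive `masked_atoms` and `noisy_coords` as input and be trained to recover the original atom types and coordinates at the returned indices.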

During the fine-tuning phase, the researchers used approximately 3 million labeled data points covering MOFs and COFs under a wide range of adsorption conditions to achieve accurate predictions of adsorption capacity.

Through a diverse database of cross-system target data, the fine-tuned Uni-MOF is able to predict the multi-system adsorption performance of MOFs in arbitrary states, including different gases, temperatures, and pressures. Therefore, Uni-MOF is a unified and easy-to-use framework for predicting the adsorption performance of MOF adsorbents.

Research findings: Uni-MOF framework has broad applications in materials science

First, the researchers validated the predictive power of Uni-MOF.

The prediction results show that on databases with sufficient data and relatively concentrated operating states, such as hMOF_MOFX_DB and CoRE_MOFX_DB, Uni-MOF is highly robust, with R² values of 0.98 and 0.92, respectively. On the more widely distributed dataset CoRE_MAP, Uni-MOF still reaches an R² of 0.83, demonstrating good generalization ability.
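For readers unfamiliar with the metric, the sketch below computes the standard coefficient of determination, R². It assumes the article's R² values follow this usual definition, which the text does not state explicitly. The function name `r_squared` and the sample numbers are illustrative only.

```python
def r_squared(y_true, y_pred):
    """Standard coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean) ** 2 for t in y_true)               # total sum of squares
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up adsorption capacities, purely for illustration.
    simulated = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.1, 1.9, 3.2, 3.8]
    print(r_squared(simulated, predicted))
```

A value of 1 means the predictions match the reference values exactly, while a value of 0 means the model does no better than always predicting the mean.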

Overall performance of Uni-MOF in large-scale databases

Second, the researchers compared Uni-MOF's predicted results with experimentally collected results.

The researchers found that the Uni-MOF framework could accurately screen high-performance adsorbents based solely on the adsorption capacity predicted under low-pressure conditions. It is worth noting that many of its low-pressure predictions deviated significantly from the experimental values, especially for Mg-dobdc and MOF-5. Even so, the framework's predictive accuracy remained high across many materials, making it suitable for solving engineering challenges.

Adsorption isotherms based on low-pressure predictions and high-pressure experimental values
Each curve represents a Langmuir fit
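A Langmuir fit of the kind mentioned in the caption can be sketched as follows. The Langmuir isotherm is q = q_max·K·P / (1 + K·P). The article does not describe the fitting procedure, so this example uses a simple linearized least-squares fit (P/q regressed against P) as an assumption. The function names and the synthetic data are hypothetical.

```python
def langmuir(p, q_max, k):
    """Langmuir isotherm: uptake q at pressure p."""
    return q_max * k * p / (1 + k * p)


def fit_langmuir(pressures, uptakes):
    """Fit q_max and K using the linearized form P/q = P/q_max + 1/(K*q_max).

    Ordinary least squares of y = P/q against x = P gives
    slope = 1/q_max and intercept = 1/(K*q_max).
    """
    xs = list(pressures)
    ys = [p / q for p, q in zip(pressures, uptakes)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    q_max = 1 / slope
    k = slope / intercept
    return q_max, k


if __name__ == "__main__":
    # Synthetic, noise-free data generated from known parameters.
    pressures = [0.1, 0.5, 1.0, 2.0, 5.0, 10.0]
    uptakes = [langmuir(p, q_max=5.0, k=0.8) for p in pressures]
    print(fit_langmuir(pressures, uptakes))
```

The linearized fit is easy to reproduce but weights the data points differently from a nonlinear fit of the isotherm itself. With noisy measurements, a nonlinear least-squares fit is usually the better choice.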

Third, the researchers validated the predictive power of Uni-MOF in cross-system properties.

The test results show that Uni-MOF is robust in predicting the adsorption capacity of unknown gases, achieving a high prediction accuracy (R²) of 0.85 for krypton and more than 0.35 for all unknown gases. Compared with single-system tasks, the Uni-MOF framework performs better on cross-system datasets and can predict the adsorption performance of unknown gases, demonstrating its strong predictive ability and universality.

Uni-MOF cross-system prediction case

In addition, to evaluate the model's ability in structural recognition, the researchers used hMOF-5004238 as an example to analyze the interatomic interactions within the material structure. The analysis demonstrates the effectiveness of Uni-MOF in identifying more than 630,000 three-dimensional spatial configurations and their atomic connections. This highlights the versatility and broad application prospects of the model.

In summary, the Uni-MOF framework is a versatile prediction platform for MOF materials. As a gas adsorption predictor for MOFs, it exhibits high accuracy in predicting gas adsorption under various operating conditions and has broad applications in the field of materials science. More importantly, Uni-MOF has achieved a significant breakthrough in the application of machine learning techniques in the field of materials science.

Discovery - Design - Optimization, AI Accelerates Materials Science

Materials science is an important discipline that is concerned with the discovery, design, and manufacture of new materials. It plays an extremely important role in various fields. From healthcare to energy storage, from environmental protection to information technology, the development of materials science is crucial to solving the various challenges facing today's society.

With the continuous advancement of technology, we are in an era of material science revolution. The emergence of new materials provides humans with new ways and new tools to solve problems. With a deeper understanding of material properties and structure, we are expected to create lighter, stronger and more energy-efficient materials.

Artificial intelligence technology can accelerate the discovery of new materials, improve material performance, and reduce R&D costs. In recent years, it has shown great application potential in the field of materials science.

* Material Discovery and Design:

Artificial intelligence technology can accelerate the discovery and design process of new materials through efficient data mining and pattern recognition. For example, machine learning algorithms can be used to analyze the structure and properties of a large number of known materials to predict new materials with specific properties. This method can greatly shorten the time for material screening and reduce experimental costs.

At the end of November 2023, Google DeepMind published a paper in Nature describing Graph Networks for Materials Exploration (GNoME), a deep learning model for materials science. Using this model together with high-throughput first-principles calculations, the team found more than 380,000 thermodynamically stable crystalline materials, which is equivalent to "nearly 800 years of knowledge accumulation by human scientists", greatly accelerating the discovery of new materials.

* Material performance prediction:

Artificial intelligence technology can build efficient prediction models to predict the performance and behavior of materials. These models can be trained based on a large amount of experimental data or simulation results to provide accurate predictions of material properties. For example, machine learning algorithms can be used to predict the mechanical properties, thermal properties, and electronic structure of materials, providing important references for material design and application.

* Material optimization and design:

Artificial intelligence technology can improve the performance and stability of materials by intelligently optimizing the structure and properties of materials. For example, the use of reinforcement learning algorithms can achieve automatic optimization during the material preparation process, thereby maximizing the performance of materials.

* Material process control and monitoring:

Artificial intelligence technology can be used to optimize the material preparation process and realize intelligent monitoring and control of the material production process. For example, machine learning algorithms can be used to analyze various parameters and conditions in the material preparation process, optimize the process flow, and improve production efficiency and material quality. At the same time, artificial intelligence technology can also realize real-time monitoring and early warning of the material production process, help discover and solve potential problems in advance, and reduce production risks.

The application of artificial intelligence technology in the field of materials science has made a series of important progress, providing new ideas and methods for the discovery, design, optimization and preparation of materials. In the future, scientists can use AI technology to better predict the performance of materials, simulate the structure of molecules, optimize the design of materials, explore the properties of materials, etc., thereby continuously promoting progress and innovation in the field of materials science.

References:
1. https://www.nature.com/articles/s41467-024-46276-x#Sec11
2. https://www.sohu.com/a/753459278_661314
3. https://www.tsinghua.edu.cn/info/1175/110086.htm