Trained on Data for 944 Materials, the GNNOpt Model Jointly Released by Tohoku University and MIT Identifies Hundreds of Candidate Solar Cell and Quantum Materials

Optoelectronic devices such as LEDs, solar cells, photodetectors and photonic integrated circuits (PICs) are at the heart of modern communications, lighting and energy conversion technologies. The performance and efficiency of these devices depend heavily on the optical properties of their materials, so a deep understanding of these properties is critical to driving technological advances and meeting growing scientific and industrial demands. To that end, researchers in both experimental and computational fields are conducting high-throughput screening to search for and develop novel materials with tailored optical properties.
Traditional experimental techniques for measuring optical properties, such as ellipsometry, UV-visible spectroscopy, and Fourier transform infrared spectroscopy (FTIR), provide accurate results, but they usually apply only to specific wavelength ranges and place strict requirements on sample conditions. These limitations restrict their use in high-throughput material screening.
To address this problem, researchers turned to first-principles calculations based on density functional theory (DFT). Compared with traditional experimental techniques, DFT calculations can cover the entire wavelength range of the optical spectrum, providing a more comprehensive analysis method. Despite the power of DFT, predicting the optical properties of crystal structures remains challenging because of the lack of effective atomic embeddings.
In response, researchers from Tohoku University and MIT launched a new artificial intelligence tool, GNNOpt, which identified 246 materials with solar energy conversion efficiencies exceeding 32% and 296 quantum materials with high quantum weights. It has greatly accelerated the discovery of energy and quantum materials and brought a new research paradigm to materials science.
The related research was published in Advanced Materials under the title “Universal Ensemble-Embedding Graph Neural Network for Direct Prediction of Optical Spectra from Crystal Structures”.
Research highlights:
* GNNOpt uses "ensemble embedding" technology, which can not only learn information from multiple datasets but also accurately predict all linear optical spectra directly from the crystal structure
* By integrating equivariant neural networks, GNNOpt achieves high-quality predictions on a small dataset of 944 materials
* GNNOpt screened 246 materials with solar energy conversion efficiency exceeding 32% from unknown materials, as well as 296 quantum materials with high quantum weight, including SiOs

Paper address:
https://onlinelibrary.wiley.com/doi/epdf/10.1002/adma.202409175
The open source project "awesome-ai4s" brings together more than 100 AI4S paper interpretations and provides massive data sets and tools:
https://github.com/hyperai/awesome-ai4s
Dataset: Small sample learning based on 944 crystal materials
The researchers trained and evaluated GNNOpt's spectral predictions on data for 944 crystalline materials derived from density functional theory (DFT) calculations. The data were retrieved from the Materials Project via its API. The spectra were computed within the independent-particle approximation (IPA) and include the frequency-dependent dielectric function and the corresponding absorption coefficient.
The entire dataset was randomly divided into a training set (733 materials), a validation set (97 materials), and a test set (110 materials), at a ratio of roughly 80%, 10%, and 10%.
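A random split of this kind can be sketched in a few lines of Python. This is only an illustration: the `split_dataset` helper, the seed, and the placeholder material IDs are assumptions, not the authors' code, and an exact 80/10/10 cut of 944 items gives slightly different counts from those reported above.

```python
import random

def split_dataset(items, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle items and cut them into train/validation/test lists."""
    items = list(items)
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    rng.shuffle(items)
    n_train = int(fractions[0] * len(items))
    n_val = int(fractions[1] * len(items))
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]     # the remainder goes to the test set
    return train, val, test

# Placeholder IDs standing in for the 944 Materials Project entries.
material_ids = [f"material-{i}" for i in range(944)]
train, val, test = split_dataset(material_ids)
print(len(train), len(val), len(test))
```

Because the three lists are slices of one shuffled list, no material can appear in more than one subset.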

GNNOpt model architecture: Directly linking crystal structure to frequency-dependent optical properties
GNNOpt is a graph neural network (GNN)-based model that uses "ensemble embedding" technology to predict all linear optical spectra directly from crystal structures. It is worth noting that before training the GNNOpt model, the researchers conducted a series of experiments to demonstrate that applying the Kramers–Kronig relations leads to better predictions of optical spectra.
As shown in Figure a below, the only input to GNNOpt is the crystal structure, and the output is a set of optical spectra. Specifically, these include the complex dielectric function, absorption coefficient, complex refractive index and reflectance.
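These output quantities are linked by standard optics relations: the complex refractive index is the square root of the complex dielectric function, the absorption coefficient is α = 2ωk/c, and the normal-incidence reflectance is R = |(ñ − 1)/(ñ + 1)|². The sketch below applies these textbook formulas to a single energy point. The function name and the sample values are hypothetical, and the code does not reproduce how GNNOpt itself post-processes its outputs.

```python
import cmath

HBAR_EV_S = 6.582119569e-16  # reduced Planck constant, eV*s
C_CM_S = 2.99792458e10       # speed of light, cm/s

def optical_constants(energy_ev, eps1, eps2):
    """Derive n, k, absorption (1/cm) and normal-incidence reflectance
    from one point of the complex dielectric function eps1 + i*eps2."""
    n_complex = cmath.sqrt(complex(eps1, eps2))  # principal root, so n >= 0
    n, k = n_complex.real, n_complex.imag
    omega = energy_ev / HBAR_EV_S                # angular frequency, rad/s
    alpha = 2.0 * omega * k / C_CM_S             # absorption coefficient, 1/cm
    reflectance = abs((n_complex - 1) / (n_complex + 1)) ** 2
    return n, k, alpha, reflectance

# Hypothetical sample point: photon energy 1 eV, eps = 8 + 6i.
n, k, alpha, refl = optical_constants(1.0, 8.0, 6.0)
print(n, k, alpha, refl)
```

Applying the function to every energy in a spectrum turns a predicted dielectric function into the other three curves.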

In Figure b, the input features of each atomic species (O, Cl, Tl) in the crystal structure are represented by one-hot encoding. Since every element in the periodic table has an atomic mass (denoted by x0), a dipole polarizability (denoted by x1) and an effective covalent radius (denoted by x2), the researchers selected these three features for ensemble embedding.

By introducing an ensemble embedding layer with automatic embedding optimization, the researchers were able to improve the model's prediction accuracy without modifying the neural network structure. The specific process is shown in Figure c below.
First, all atomic input features are automatically optimized through the ensemble embedding layer. Then, the embedded features are passed through a series of equivariant graph convolutions and gated nonlinear layers; to achieve equivariance, the convolution filters consist of learnable radial functions and spherical harmonics. Next, the results are passed to a post-processing layer, which applies activation and aggregation operations to generate the predicted output spectrum. Finally, the GNNOpt weights are trained and optimized by minimizing the mean squared error (MSE) loss between the predicted spectrum and the true spectrum.

To gain a deeper understanding of the crystal structure, the researchers analyzed the unit cell structure of TlClO4, as shown in Figure d. The circular nodes represent atoms in the unit cell, and the lines represent the direction of information transfer in the graph convolutional layer.

Figure e shows the details of the generic ensemble embedding layer, which is the key factor in improving performance even without any changes to the neural network model. For each atom, each feature is embedded independently by its own linear and activation layers. Then, all embedded features are combined by a weighted average using learnable mixing probabilities p_i, which are normalized so that ∑_i p_i = 1.
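The weighted mixing described here can be illustrated with a minimal pure-Python sketch. It makes several assumptions that the article does not confirm: the mixing probabilities are obtained from learnable logits through a softmax (one common way to enforce ∑_i p_i = 1), the activation is a ReLU, and all names, shapes and numbers are hypothetical.

```python
import math

def softmax(logits):
    """Map arbitrary real logits to probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear_relu(x, weight, bias):
    """Dense layer followed by ReLU; weight has shape (out_dim, in_dim)."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weight, bias)]

def ensemble_embed(features, weights, biases, mix_logits):
    """Embed each feature vector separately, then average with weights p_i."""
    p = softmax(mix_logits)  # assumed parametrization of the p_i
    embedded = [linear_relu(x, w, b)
                for x, w, b in zip(features, weights, biases)]
    out_dim = len(embedded[0])
    mixed = [sum(p_i * h[j] for p_i, h in zip(p, embedded))
             for j in range(out_dim)]
    return mixed, p

# Two toy feature vectors for one atom, identity weights, equal logits.
identity = [[1.0, 0.0], [0.0, 1.0]]
mixed, p = ensemble_embed(
    features=[[1.0, 0.0], [0.0, 2.0]],
    weights=[identity, identity],
    biases=[[0.0, 0.0], [0.0, 0.0]],
    mix_logits=[0.0, 0.0],
)
print(mixed, p)
```

In training, the weights, biases and logits would all be updated by gradient descent, so the network itself learns how much each atomic feature contributes.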

Model performance: GNNOpt identifies hundreds of solar cell and quantum material candidates
To test the performance of the GNNOpt model, the researchers used GNNOpt to identify solar cell materials and quantum materials, and successfully identified 246 solar cell materials and 296 quantum materials with high quantum weights.
Details of the above materials are given in the additional information:
https://go.hyper.ai/rVSS8
GNNOpt can screen 246 solar cell materials from unknown materials
In identifying potential solar cell materials with high-performance energy conversion functions, researchers used the Spectroscopic Limited Maximum Efficiency (SLME) method to conduct preliminary screening and evaluation of the photoelectric conversion efficiency of solar cells.
Subsequently, the researchers used the GNNOpt model to predict the energy conversion efficiency (η value) of 5,281 unknown crystal structures in the Materials Project. It should be noted that these crystal structures do not have real spectral data. As shown in Figure a below, the researchers compared the predicted efficiency of the test set with the actual efficiency, and the result showed R² = 0.81, indicating that GNNOpt has a high prediction accuracy for the photoelectric conversion efficiency of solar cells.
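The R² value quoted above is the coefficient of determination, R² = 1 − SS_res/SS_tot, which compares the squared prediction errors with the variance of the reference values. A minimal sketch follows; the efficiency values are made-up placeholders, not data from the paper.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination between reference and predicted values."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return 1.0 - ss_res / ss_tot

# Placeholder efficiencies in percent (hypothetical, for illustration only).
eta_dft = [5.0, 12.0, 20.0, 28.0, 31.0]
eta_pred = [6.5, 10.0, 22.0, 26.0, 30.0]
print(round(r2_score(eta_dft, eta_pred), 3))
```

A value of 1 means perfect agreement, while 0 means the model does no better than always predicting the mean of the reference values.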

In Figure b, the researchers plotted the efficiency η predicted by GNNOpt and the actual efficiency η obtained from DFT for the test set as a function of the band gap Eg. The maximum value of η is about 32% at Eg ≈ 1.3 eV, which is consistent with the Shockley–Queisser (SQ) limit. However, SLME is a more stringent selection parameter for solar cell materials than the SQ limit: for materials with similar band gaps, SLME yields a wide range of η values, indicating that the absorption coefficient α(E) contributes significantly to η.

In addition, understanding which elements in the periodic table contribute most to efficient solar cell materials can provide preliminary guidance for material design. The GNNOpt model predicts that transition metals (such as Tc, Rh, Pd, Pt, Cu, Ag, Au and Hg) and chalcogens (such as S, Se and Te) are the main components of solar cell materials. This result is consistent with well-known solar cell materials such as Cu-rich chalcopyrite, Pb-based perovskites and CdTe.

To verify the SLME prediction value of the GNNOpt model for unknown materials, the researchers selected three examples from the list of the highest SLME materials: LiZnP, SbSeI, and BiTeI. It should be noted that these materials are not in the DFT database. Therefore, the researchers performed DFT calculations on these materials to determine the absorption coefficients α(E) of these materials.
The results are shown in Figure d below. The DFT results (dotted line) are highly consistent with the α values predicted by GNNOpt (solid line). This shows that GNNOpt can serve as an effective material screening tool at a significantly lower computational cost. It is worth mentioning that for large databases, GNNOpt can be combined with genetic algorithms (GA) to accelerate the search for candidate materials.

GNNOpt successfully detected 296 quantum materials including SiOs
In addition to identifying unknown solar cell materials with high energy-conversion potential, another application of GNNOpt is to detect quantum geometry and topology in quantum materials. Previous work has shown that the generalized quantum weight can be derived from the optical spectrum and is a direct indicator of ground-state quantum geometry and topology. The quantum weight Kxx is a modification of the inverse-frequency-weighted f-sum rule.
* The quantum weight Kxx is an important physical quantity in quantum systems. It is related to the optical and electronic properties of materials and is used in particular to measure their quantum geometric and topological properties. It describes the relationship between the quantum geometric structure of a material and its optical or electrical properties.
In Figure a, the researchers compared the predicted Kxx with the true Kxx (in units of h/e²) on the test set. For Kxx < 25, R² = 0.73, indicating that the GNNOpt predictions are close to the results of the DFT calculations.

Therefore, GNNOpt was used to predict the Kxx values of 5,281 unknown insulating materials. To simplify the analysis, the researchers took the quantum weight of the well-known topological insulator Bi2Te3, Kxx = 28.87, as the threshold for classifying quantum materials: materials with Kxx > 28.87 are considered high-Kxx materials.
Ultimately, the researchers identified 296 high-Kxx materials. Some of these materials, such as ZrTe5 (Kxx = 33.90), TaAs2 (Kxx = 37.66), FeSi (Kxx = 48.74) and NbP (Kxx = 35.58), have already been confirmed as quantum materials exhibiting the anomalous Hall effect, large magnetoresistance, topological Fermi arcs and quantum oscillations.
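The classification step amounts to a simple threshold filter. The sketch below uses only the Kxx values quoted in this article; the dictionary layout and the `high_kxx` helper are illustrative assumptions, and the full screening covered 5,281 predicted values rather than this handful.

```python
BI2TE3_KXX = 28.87  # quantum weight of Bi2Te3, used as the threshold

# Kxx values quoted in the article.
predicted_kxx = {
    "ZrTe5": 33.90,
    "TaAs2": 37.66,
    "FeSi": 48.74,
    "NbP": 35.58,
    "SiOs": 46.52,
    "Bi2Te3": 28.87,
}

def high_kxx(materials, threshold=BI2TE3_KXX):
    """Return (formula, Kxx) pairs strictly above the threshold, largest first."""
    hits = [(formula, k) for formula, k in materials.items() if k > threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)

for formula, k in high_kxx(predicted_kxx):
    print(f"{formula}: Kxx = {k:.2f}")
```

Because the comparison is strict, the reference material Bi2Te3 itself is not counted as a high-Kxx material.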

Because SiOs has a very high quantum weight (Kxx = 46.52) and had not previously been studied in depth, the researchers performed additional DFT calculations on SiOs and analyzed its electronic band structure. As shown in Figure c, SiOs hosts three-fold fermions and double-Weyl fermions at the Γ point and R point, respectively.

Figure d shows the band structure of the SiOs (001) surface, which the researchers calculated using maximally localized Wannier functions and Green's function methods. The result highlights the quantum nature of SiOs.

Artificial intelligence will reshape the material research and development process, and materials will be generated in reverse
In the rapid development of materials science, AI technology is leading a revolution. Gan Yong, an academician of the Chinese Academy of Engineering, has publicly stated that "artificial intelligence will reshape the material research and development process, and materials will be generated in reverse."
First, the application of AI in materials discovery is particularly significant. At the end of November 2023, Google DeepMind released GNoME, a deep learning model for materials science. Using this model together with high-throughput first-principles (DFT) calculations, the team found more than 380,000 thermodynamically stable crystalline materials, greatly accelerating the discovery of new materials.
Paper address:
https://www.nature.com/articles/s41586-023-06735-9
Not to be outdone, Microsoft released MatterGen, a generative AI model for materials science, a few days after the GNoME model was published. It can predict new material structures on demand based on the required material properties.
Paper address:
https://arxiv.org/abs/2312.03687
In January 2024, Microsoft collaborated with the Pacific Northwest National Laboratory (PNNL) under the U.S. Department of Energy to use artificial intelligence and high-performance computing to screen out an all-solid-state electrolyte material from 32 million inorganic materials. This work completed a closed loop from prediction to experiment and can help develop next-generation lithium-ion battery materials.
Paper address:
https://arxiv.org/abs/2401.04070
In addition, AI plays an important role in predicting material properties. Machine learning models can predict the electronic structure and mechanical properties of materials, thereby optimizing material design. For example, ABACUS, an open-source density functional theory package from China developed by Chen Mohan, a researcher at the School of Engineering of Peking University, has been combined with the AI-assisted exchange-correlation functional method DeePKS to overcome the trade-off between accuracy and efficiency in DFT calculations, achieving hybrid-functional accuracy at high efficiency.
Paper address:
https://pubs.acs.org/doi/10.1021/acs.jpca.2c05000
The application of AI in materials science goes far beyond this. At the implementation level, companies such as Green Dynamics, CuspAI, and DeepVerse are also committed to applying AI to new materials. As the technology continues to develop, AI may unleash unlimited power in the field of materials science!