HyperAI

Google Develops Odor Recognition AI Based on GNN, Which Is Equivalent to 70 Years of Continuous Work by Human Evaluators

2 years ago
Information
Xuran Zhang

Contents at a glance: Smells are always around us, yet they are difficult to describe accurately. Recently, Osmo, a company spun out of Google Research, developed an odor analysis AI based on graph neural networks that predicts a molecule's smell from its chemical structure. Using this AI, researchers drew a principal odor map that links chemical structure to odor, which is expected to provide a new method for research on perception.

Keywords: Odor Analysis, GNN, Odor Spectrum

Author | Xuecai

Editor | Sanyang

This article was first published on HyperAI WeChat public platform~

A fundamental problem in neuroscience research is mapping the physical properties of external stimuli into sensory perceptions.

In vision, color maps to wavelength; in hearing, pitch maps to frequency. In the sense of smell, however, the mapping between odors and substances is difficult to establish.

Currently, we can only pick out some basic smells, arrange them on an odor wheel, and then describe more complex smells as combinations of these basic ones.

Figure 1: Schematic diagram of the odor wheel

However, such a coarse classification is hard to use in scientific research. Although technologies such as odor sensors are used to monitor odors, these sensors can identify only specific odors. Odor identification therefore often still relies on human odor evaluators, a process that is time-consuming and poorly repeatable.

Recently, Osmo, a company spun out of Google Research, developed an odor analysis AI based on graph neural networks (GNN). It can describe the smell of a chemical molecule based on its structure. For 53% of the molecules tested and 55% of the odor descriptors, the model's predictions were better than the median of a human panel. Finally, the researchers used this model to draw the Principal Odor Map (POM). This result has been published in Science.

Related research has been published in Science

Paper link:

https://www.science.org/doi/full/10.1126/science.ade4401

Experimental procedures

GNN models are stable across multiple architectures

Smell is essentially people's perception of chemical molecules in the air, so the structure of a molecule affects how it smells. In a GNN, each molecule is represented as a graph, with atoms as nodes and bonds as edges, and the information from its substructures is combined into a representation of the whole molecule.

After the molecular structure is fed into the model, the GNN learns how strongly different chemical substructures contribute to a particular odor. A prediction layer then judges the molecule's odor and outputs the corresponding odor descriptors.

Figure 2: Schematic diagram of the GNN model
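To make the idea of a molecule as a graph concrete, here is a minimal sketch in plain Python. It is not the model from the paper: the toy molecule (ethanol without hydrogens), the four-element vocabulary, and the function names are all assumptions for illustration, and the learned weights that a real GNN applies at each step are left out.

```python
# Hypothetical toy molecule: ethanol (C-C-O), hydrogens omitted.
atoms = ["C", "C", "O"]        # nodes
bonds = [(0, 1), (1, 2)]       # undirected edges between atom indices

# Assumed, tiny element vocabulary used for one-hot node features.
ELEMENTS = ["C", "O", "N", "S"]


def one_hot(symbol):
    """Encode an element symbol as a one-hot feature vector."""
    return [1.0 if symbol == e else 0.0 for e in ELEMENTS]


def neighbors(n_atoms, bonds):
    """Build an adjacency list from the bond list."""
    adj = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        adj[a].append(b)
        adj[b].append(a)
    return adj


def message_pass(features, adj):
    """One round: each atom adds its neighbours' features to its own.

    A trained GNN would apply learned weights here; this sketch does not.
    """
    updated = []
    for i, own in enumerate(features):
        agg = list(own)
        for j in adj[i]:
            agg = [x + y for x, y in zip(agg, features[j])]
        updated.append(agg)
    return updated


def readout(features):
    """Sum all atom vectors into one vector for the whole molecule."""
    return [sum(column) for column in zip(*features)]


features = [one_hot(a) for a in atoms]
adj = neighbors(len(atoms), bonds)
hidden = message_pass(features, adj)
molecule_vector = readout(hidden)
print(molecule_vector)
```

In a full model, the molecule-level vector would be passed to a prediction layer that outputs one score per odor descriptor.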

By combining the Good Scents and Leffingwell & Associates databases (the GS-LF database), the researchers selected 5,000 molecules as the dataset for the model. Each molecule can carry multiple odor labels, such as cheesy or fruity.

Figure 3: Some molecules in the GS-LF database

Subsequently, the GS-LF database was divided into training and testing sets in a ratio of 8:2, and the training set was further divided into five cross-validation subsets.
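The split described above can be sketched as follows. This is a plain random split under assumed settings (the seed, the integer molecule IDs, and the function name are hypothetical); the article does not say whether the original split was stratified by odor label.

```python
import random


def split_dataset(items, test_fraction=0.2, n_folds=5, seed=0):
    """Shuffle, hold out a test set, and cut the rest into CV folds."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    test = shuffled[:n_test]
    train = shuffled[n_test:]
    # Every n_folds-th item goes to the same fold, so fold sizes differ by at most 1.
    folds = [train[i::n_folds] for i in range(n_folds)]
    return train, test, folds


molecule_ids = list(range(5000))   # stand-ins for the 5,000 molecules
train, test, folds = split_dataset(molecule_ids)
print(len(train), len(test), [len(f) for f in folds])
```

During cross-validation, each fold in turn serves as the validation subset while the other four are used for training.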

Bayesian optimization was used together with cross-validation to tune the hyperparameters of the GNN model. After tuning, the GNN model performed stably across multiple architectures, with the highest AUROC on the cross-validation set reaching 0.89.
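For readers unfamiliar with AUROC, the sketch below computes it for a single odor descriptor as the probability that a randomly chosen positive example is scored above a randomly chosen negative one. The labels and scores are made-up values, and how the paper aggregates AUROC across descriptors is not covered here.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 means the molecule carries the descriptor (e.g. "fruity").
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.6, 0.1]
value = auroc(labels, scores)
print(round(value, 3))
```

An AUROC of 0.5 corresponds to random guessing and 1.0 to a perfect ranking, which puts the reported 0.89 in context.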

GNN models outperform humans in odor prediction

To verify the model's ability to distinguish other molecules, the researchers conducted odor tests on the GNN model and a human group.

Figure 4: Judgment of the odor of 2,3-dihydrobenzofuran-5-carboxaldehyde by different models

A: GNN model;

B: RF model;

C: Human group;

D: Evaluation of the odor of 2,3-dihydrobenzofuran-5-carboxaldehyde by different evaluators.

For 53% of the molecules, the GNN model's odor predictions were better than the median of the human group. The previous state-of-the-art approach, a random forest (RF) model based on count-based fingerprints (cFP), outperformed the human group on only 41% of the molecules.

Figure 5: Correlation of predictions from different models with the average of the human group
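A figure such as "better than the median of the human group for 53% of molecules" can be read as a per-molecule comparison. The sketch below assumes that each molecule has one agreement score for the model and one per panelist (for example, a correlation with the panel average); the numbers and the function name are hypothetical, and the paper's exact scoring procedure may differ.

```python
from statistics import median


def fraction_model_beats_median(model_scores, panelist_scores):
    """Share of molecules where the model scores above the median panelist."""
    wins = sum(
        1
        for model, humans in zip(model_scores, panelist_scores)
        if model > median(humans)
    )
    return wins / len(model_scores)


# Hypothetical agreement scores for four molecules.
model_scores = [0.70, 0.40, 0.65, 0.30]
panelist_scores = [
    [0.50, 0.60, 0.80],
    [0.45, 0.55, 0.35],
    [0.60, 0.62, 0.70],
    [0.20, 0.25, 0.50],
]
share = fraction_model_beats_median(model_scores, panelist_scores)
print(share)
```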

The researchers then broke down the GNN model's predictions by odor descriptor. Except for musk, the GNN model's predictions for every descriptor fell within the human group's error distribution, and for 30 odor descriptors the model outperformed the median of the human group.

Figure 6: Judgment results of the GNN model, RF model and human group on different molecules

The GNN model's predictions depend on molecular structure. It is therefore more accurate for odors tied to a characteristic structure, such as the garlic smell of sulfur-containing molecules and the fishy smell of amines. Musk, by contrast, spans at least five structural classes (macrocyclic, polycyclic, nitro, steroidal, and linear), so the GNN model predicts it worst.

The performance of the human group was affected by familiarity. They were more consistent in their judgments of common food aromas such as nuts, garlic, and cheese, but diverged more on musk and hay.

In addition, how often a descriptor appears in the training set affects how well the GNN model predicts that odor. When a descriptor occurs often enough, the GNN model can make relatively accurate predictions even for odors that span complex structures, such as fruity, floral, and sweet.

Figure 7: Effect of training data on the correlation between the GNN model prediction results and the human group average

However, for descriptors that appear less frequently, the accuracy of the GNN model is polarized. Prediction accuracy was high for fishy, mint, and camphor, but poor for ozone, acetic acid, and fermented.

GNN model draws the main odor spectrum

After verifying the performance of the GNN model, the researchers further used it in different olfactory tasks.

First, they tested the model on cases where structure and smell diverge. Given a molecule with a known smell, the model had to judge both a molecule with a similar structure but a different smell and a molecule with a different structure but a similar smell. For these counterintuitive structure-odor relationships, the GNN model judged correctly 50% of the time, while the RF model did so only 19% of the time.

Figure 8: A group of triplets whose structures or smells are close to known molecules

After obtaining a stable structure-odor relationship, the researchers set out to draw a large-scale odor map. They computed the Principal Odor Map (POM) for approximately 500,000 molecules. These molecules have not yet been studied scientifically, and most of them have not even been synthesized.

However, their positions in the map can be calculated directly by the GNN model, which makes a large-scale odor map possible. If a trained human evaluator were to assess the smell of these molecules, it would take about 70 years of continuous work.

Figure 9: Main odor spectrum

In the figure, the coordinates of each molecular odor are determined by the GNN model, and the RGB value of its color corresponds to the coordinates of the first three dimensions in the predicted odor matrix.
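One way to turn the first three coordinates into a color is min-max scaling to the 0-255 range, as in the sketch below. The scaling method, the sample embedding, and the function name are assumptions; the article only states that the RGB value corresponds to the first three dimensions.

```python
def to_rgb(points):
    """Map the first three coordinates of each point to an (R, G, B) tuple."""
    dims = list(zip(*[p[:3] for p in points]))
    lows = [min(d) for d in dims]
    highs = [max(d) for d in dims]
    colors = []
    for p in points:
        rgb = []
        for value, lo, hi in zip(p[:3], lows, highs):
            span = hi - lo
            # A constant dimension carries no information; map it to 0.
            scaled = 0.0 if span == 0 else (value - lo) / span
            rgb.append(int(round(scaled * 255)))
        colors.append(tuple(rgb))
    return colors


# Hypothetical 4-dimensional odor coordinates for three molecules.
embedding = [
    [0.0, 1.0, -2.0, 0.3],
    [1.0, 3.0, 2.0, 0.9],
    [0.2, 1.4, -1.2, 0.1],
]
colors = to_rgb(embedding)
print(colors)
```

With this kind of mapping, molecules that lie close together in the first three dimensions receive similar colors, so neighboring regions of the map shade into one another.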

The Proust Effect: The Link between Smell and Memory

When we smell a certain scent, it can remind us of a past memory and make that memory more vivid and emotional. In "Remembrance of Things Past," the writer Marcel Proust describes how, when the narrator smells a madeleine soaked in tea, "the past came to mind." This phenomenon is therefore also called the Proust effect.

The sense of smell is more closely connected to memory in the nervous system than any other sense. It is the only sensory system that is directly connected to the emotional and memory areas of the brain. When the olfactory cells are activated, nerve impulses are sent directly to the piriform cortex and on to closely connected regions, including the amygdala, which is involved in fear and other emotions, and the parahippocampal gyrus, which is involved in memory.

Figure 10: Components of the olfactory circuit

Labeled regions: primary olfactory cortex, amygdala, and hippocampus.

Because smell, memory, and emotion are so closely connected, perfume has become a must-have for many people when they go out to meet others. The other person may not recall your name the next time you meet, but on catching the same scent, he or she may well remember the occasion.

With the help of AI, people are gaining a deeper understanding of the connection between molecular structure and smell. Maybe one day we will really be able to blend the scents most familiar to us: open the bottle, and it works like a time machine that carries your memories back to the past.

Reference Links:

[1] https://perfumersupplyhouse.com/2014/01/09/fragrance-creation-wheels-for-you/

[2] https://www.slideserve.com/cora-schroeder/functional-neuroanatomy
