ProteinGym Protein Mutation Dataset
The dataset contains approximately 1.5 million missense variants from 87 deep mutational scanning (DMS) experiments.
The paper "Enhancing efficiency of protein language models with minimal wet-lab data through few-shot learning" uses this dataset as a benchmark. Its results were published in Nature Communications, a Nature Portfolio journal.