Weekly Editor's Picks | RJUA-QA Medical Dataset Launched, 3D Molecular Generation Model ResGen Paper Analysis

HyperAI's new column is here! Every Monday, the HyperNeural editorial department selects content (datasets, AI4S paper cases, encyclopedia entries) updated on the hyper.ai official website during the previous week and publishes it here. You are welcome to visit hyper.ai directly to view all the content.

A quick overview of updates on the hyper.ai official website from January 15th to January 21st:

* High-quality public datasets: 10

* AI4S paper cases: 2

* Popular encyclopedia entries: 10

Visit the official website: https://hyper.ai/

Selected public datasets

1. CrossDock2020: Dataset processed for the ResGen study

The raw data behind this dataset contains more than 22 million protein-ligand pairs. The dataset can be used for research on protein-small molecule interactions, especially for evaluating how well molecules bind to protein pockets.

Direct use:

https://hyper.ai/datasets/29021

2. RJUA-QA: The first Chinese medical specialty question-answering reasoning dataset

RJUA-QA is a question-answering reasoning dataset for urology. It was created by the Ant Group Medical LLM team in collaboration with the urology expert team of Renji Hospital, affiliated with Shanghai Jiao Tong University School of Medicine. The dataset converts real clinical patient data into virtual patient clinical dialogues, presented in the Q-context-A (question-context-answer) format.

Direct use:

https://hyper.ai/datasets/28970

3. MetaMathQA Mathematical Reasoning Dataset

To improve models' forward and reverse reasoning capabilities, researchers from Cambridge, HKUST, and Huawei built MetaMathQA, a high-quality mathematical reasoning dataset with broad coverage, on top of two commonly used mathematical datasets (GSM8K and MATH). MetaMathQA consists of 395K forward and reverse mathematical question-answer pairs generated by a large language model.

Direct use:

https://hyper.ai/datasets/28954

4. M³IT Multimodal Multilingual Instruction Tuning Dataset

The dataset combines 40 datasets, with 2.4 million instances and 400 manually written task instructions, reformatted into a vision-to-text structure. It covers a range of classic vision-language tasks, including captioning, visual question answering (VQA), visually conditioned generation, reasoning, and classification.

Direct use:

https://hyper.ai/datasets/29048

5. ChatHaruhi-RolePlaying role-playing dialogue dataset

ChatHaruhi is a dataset covering 32 Chinese and English TV and anime characters, with more than 54k simulated dialogues. Role-playing chatbots built with large language models have attracted widespread attention. To imitate specific fictional characters, the research team proposed an algorithm that controls the language model through improved prompts and character memories extracted from scripts. By collecting corpora from movies, novels, and scripts and performing structured extraction, the team gathered more than 23,000 dialogue messages.

Direct use:

https://hyper.ai/datasets/28926

For more updated datasets this week, please visit:

https://hyper.ai/datasets

Selected AI4S paper cases

1. 8 times faster than the previous best method: Hou Tingjun et al. from Zhejiang University proposed ResGen, a 3D molecular generation model based on protein pockets

A research team from Zhejiang University and Zhijiang Laboratory proposed ResGen, a 3D molecular generation model based on protein pockets. It is 8 times faster than the previous best method and successfully generates drug-like molecules with lower binding energy and higher diversity. The paper has been published in the journal Nature Machine Intelligence.

View the full report:

https://hyper.ai/news/29026

2. Luo Xiaozhou's team from the Chinese Academy of Sciences proposed the UniKP framework, which combines a large model with machine learning to predict enzyme kinetic parameters with high accuracy

Luo Xiaozhou's team from the Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, proposed UniKP, a framework for predicting several different enzyme kinetic parameters. The paper has been published in the journal Nature Communications.

View the full report:

https://hyper.ai/news/29000

Popular Encyclopedia Articles

1. Sigmoid function

2. Markov Chain

3. Prompt Injection

4. Reward Model

5. Prompt Engineering

We have compiled hundreds of AI-related terms to help you understand "artificial intelligence":

https://hyper.ai/wiki

That is all for this week's editor's picks. If you have resources you would like to see included on the hyper.ai official website, you are welcome to leave a message or send us a submission.

See you next week!

About HyperAI

HyperAI (hyper.ai) is the leading artificial intelligence and high-performance computing community in China. We are committed to becoming the infrastructure for data science in China and to providing rich, high-quality public resources for domestic developers. So far, we have:

* Provided domestic accelerated download nodes for 1200+ public datasets

* Included 300+ classic and popular online tutorials

* Interpreted 100+ AI4Science paper cases

* Supported search for 500+ related terms

* Hosted the first complete Chinese documentation for Apache TVM in China

Visit the official website to start your learning journey:

https://hyper.ai/

