3 Days to Go! Reserve Your Spot for the Apple WWDC24 Livestream Now; RLAIF-V Large-scale Multimodal Preference Dataset Is Online, Effectively Reducing Hallucinations in Different MLLMs

Updates to the hyper.ai official website from June 3 to June 7:
High-quality public datasets: 10
High-quality tutorial selection: 2
Selected community articles: 3
Popular encyclopedia entries: 5
Top conferences with deadlines in June and July: 5
Visit the official website: hyper.ai
Selected Public Datasets
1. ChartQA Chart Question Benchmark Dataset
The dataset covers 9.6K human-written questions and 23.1K questions generated from human-written chart summaries. It targets complex questions that require both visual and logical reasoning.
Direct use: https://go.hyper.ai/5tJE9
2. RS5M Large-scale Image-Text Pairing Remote Sensing Dataset
The RS5M dataset contains 5 million remote sensing images with English descriptions. It was obtained by using a pre-trained vision-language model (VLM) to filter publicly available image-text pair datasets and labeled remote sensing (RS) datasets.
Direct use: https://go.hyper.ai/jbwsV
3. CapsFusion-120M Multimodal Image and Text Dataset
This dataset contains image and text information from the LAION-2B and LAION-COCO datasets, which can be used for large-scale multimodal pre-training or to further study the quality of image and text data.
Direct use: https://go.hyper.ai/pEE7u
4. ShareGPT4V Large-scale High-quality Image and Text Dataset
The dataset contains 1.2 million image-text pairs. It helps align visual and language features effectively, strengthens a model's ability to follow instructions, and incorporates more academic task data such as ScienceQA, TextVQA, and SBU.
Direct use: https://go.hyper.ai/9CVao
5. RLAIF-V-Dataset Large-scale Multimodal Preference Dataset
The RLAIF-V dataset is an AI-generated multimodal preference dataset that covers a variety of tasks and domains. It contains more than 44,757 high-quality comparison pairs for training and evaluating multimodal large language models.
Direct use: https://go.hyper.ai/cG6fp
6. FoodLogoDet-1500 High-quality Food Logo Detection Dataset
The dataset consists of 1,500 categories, 99,768 images, and 145,400 objects. This is the first and largest publicly available food logo detection dataset.
Direct use: https://go.hyper.ai/eco23
The dataset contains 20,603 food images collected from 10 restaurant scenes. Each image has multiple food objects annotated with bounding boxes, for a total of 95,322 bounding boxes across 291 classes.
Direct use: https://go.hyper.ai/6xrrC
The dataset contains more than 1,000 fine-grained food categories and more than 500,000 images. It was used in the LargeFineFoodAI large-scale fine-grained food analysis competition at the ICCV 2021 workshop.
Direct use: https://go.hyper.ai/sjZJi
9. ISIA Ingredient-201 Food Image Dataset
There are 201 subcategories in this dataset, covering common types of existing food categories. Food images are collected in 5 food-related scenes, and at least 150 food categories are collected in each scene.
Direct use: https://go.hyper.ai/bGe45
10. ISIA Food-500 Food Dishes Dataset
The dataset contains 399,726 food items, including more than 500 dishes. Each item includes the food name and food image.
Direct use: https://go.hyper.ai/yqco5
For more public datasets, please visit:
Selected Public Tutorials
The DynamiCrafter model, launched by the Chinese University of Hong Kong, Tencent AI Lab, and others, uses video diffusion technology to simulate real-world motion patterns. Combined with text instructions, it can turn still images into dynamic videos. This tutorial provides a ready-built ComfyUI workflow environment, so there is no need to worry about node connection errors. Just upload an image and enter the text to get started!
Run online: https://go.hyper.ai/PWzJR
2. Don’t wait! Come and experience GLM-4-9B-Chat Demo
This week, Zhipu AI released GLM-4-9B, the latest open-source model in its GLM-4 foundation model series, which offers multimodal capabilities for the first time. To let everyone try this open-source model, which claims to "surpass Llama3-8B", as soon as possible, HyperAI has launched the "GLM-4-9B-Chat Demo" tutorial. Just click Clone to start experiencing GLM-4-9B-Chat right away, without entering any commands.
Run online: https://go.hyper.ai/hc5OK
Community Articles
Hong Liang's research group at Shanghai Jiao Tong University proposed PROTLGN, a microenvironment-aware graph neural network that learns from the three-dimensional structure of proteins to predict beneficial amino acid mutation sites, guiding the design of single-site and multi-site mutations for proteins with different functions. More than 40% of the single-site mutant proteins designed by PROTLGN outperform their wild-type counterparts. The results have been published in "JCM".
View the full report: https://go.hyper.ai/6FkFu
Kang Jianqiang's team at Wuhan University of Technology proposed a simplified electrochemical model that combines ensemble learning (ELM) with FIE. ELM accurately predicts the lithium-ion concentration in the solid electrode, yields more accurate voltage predictions than a single model, and has much lower computational complexity than the P2D model. FIE accurately predicts the lithium-ion concentration in the electrolyte near the positive and negative current collectors.
View the full report: https://go.hyper.ai/CWvce
Professor Mei Yongfeng's research group in the Department of Materials Science at Fudan University proposed a multi-level quasi-static finite element analysis method, then designed and built six types of three-dimensional microstructures assembled from silicon/chromium nanofilms, along with the corresponding three-dimensional photodetectors, verifying the versatility and industrial practicality of the technology. The results have been published in "Nature".
View the full report: https://go.hyper.ai/2s73Q
Popular Encyclopedia Articles
1. Nuclear Norm
2. Masked Language Modeling (MLM)
3. Long Short-Term Memory (LSTM)
4. YOLOv10 Real-time End-to-End Object Detection
5. Kolmogorov-Arnold Networks
We have compiled hundreds of AI-related terms to help you understand "artificial intelligence":
Bilibili Livestream Preview
Apple will hold WWDC24 on June 11 (next Tuesday), Beijing time. HyperAI's video account and Bilibili channel will stream it live. Please scan the QR code to reserve the livestream↓

To help you gain a deeper understanding of Apple, HyperAI's Bilibili live room will continuously broadcast "Apple Special" videos, including past WWDC conferences, executive interviews, related documentaries, and more.
The table below previews the content selected by the editor:
| Date | Time | Content |
| --- | --- | --- |
| Monday, June 10 | 18:00 | Steve Jobs |
| Tuesday, June 11 | 01:00 | Apple WWDC24 |
| Wednesday, June 12 | 18:00 | What makes Apple |
| Thursday, June 13 | 18:00 | First iPhone launch |
| Friday, June 14 | 18:00 | History of Steve Jobs |
| Saturday, June 15 | 18:00 | How Apple survived near-bankruptcy |
| Sunday, June 16 | 18:00 | Tim Cook's history |
HyperAI TV streams live 24/7. Click to get your "electronic pickles" for the AI field:
http://live.bilibili.com/26483094
Top Conferences with Deadlines in June and July

One-stop tracking of top AI academic conferences: https://hyper.ai/events
That is everything in this week's editor's selection. If you have resources you would like to see included on the hyper.ai official website, you are welcome to leave a message or submit an article to let us know!
See you next week!
About HyperAI
HyperAI (hyper.ai) is the leading artificial intelligence and high-performance computing community in China. We are committed to becoming the infrastructure for data science in China and to providing rich, high-quality public resources for domestic developers. So far, we have:
* Provided domestic accelerated download nodes for 1200+ public datasets
* Included 300+ classic and popular online tutorials
* Interpreted 100+ AI4Science paper cases
* Supported search for 500+ related terms
* Hosted the first complete Chinese documentation for Apache TVM in China
Visit the official website to start your learning journey: