Deploy Meta's First Multimodal Large Model With One Click! The First Multi-Stitch Embroidery Dataset Is Online, With More Than 30K Images

In his keynote at Meta Connect 2024, Zuckerberg announced Meta's first multimodal large model, Llama 3.2 Vision! It comes in two versions, 11B and 90B, the first models in the Llama series to support multimodal tasks. According to Meta's official figures, these two open-source models outperform some closed-source models.

We couldn't wait to deploy it, so we immediately published a one-click deployment tutorial for the 11B model on the hyper.ai official website. Everyone is welcome to try out inference with it!

Run online: https://go.hyper.ai/DKGzm

Updates on the hyper.ai official website from September 23 to September 27:

* High-quality public datasets: 10

* Selected high-quality tutorials: 2

* Selected community articles: 3

* Popular encyclopedia entries: 5

* Top conferences with deadlines in October: 7

Visit the official website: hyper.ai

Selected Public Datasets

1. MSEmbGAN Multi-Stitch Embroidery Dataset

This is the first embroidery dataset with detailed single-stitch and multi-stitch annotations. It contains more than 30K images, including embroidery images and their corresponding content images, produced with professional embroidery software (Wilcom 9.0). The paper's authors are contributing the multi-stitch embroidery dataset they built to other researchers in the field.

Direct use: https://go.hyper.ai/urNGE

2. The Movies Dataset

This dataset contains detailed metadata for the 45,000 movies in the full MovieLens dataset, all released before July 2017. It covers basic information such as posters, backdrops, budgets, and revenues, as well as release dates, languages, production countries, and production companies.

Direct use: https://go.hyper.ai/SDwXX

3. Open X-Embodiment Real Robot Dataset

The dataset brings together data from 22 different robot types, ranging from single-arm robots to bimanual robots and quadrupeds. The data was collected by 21 institutions and covers 527 skills and 160,266 tasks. It was built by aggregating 60 existing robotics datasets from 34 robotics research laboratories around the world, spanning a wide variety of robotic tasks and environments.

Direct use: https://go.hyper.ai/Cqlw6

4. TMDB 5k Movie Dataset

This dataset contains detailed information about 5,000 movies, mainly from the United States, spanning roughly a century (1916-2017). It is designed to help researchers and analysts explore popular trends and investment directions in the film industry, and to offer guidance to newcomers to the industry.

Direct use: https://go.hyper.ai/zaRFY

5. LongCite-45k Large Model Fine-Grained Improvement Dataset

The dataset contains 44,600 high-quality question-answering examples with sentence-level citations and supports long-text processing with contexts of up to 128k tokens. It is meant to help models generate fine-grained, sentence-level citations so that users can verify the accuracy of the model's answers.

Direct use: https://go.hyper.ai/omO5f

6. Full TMDB Movies Dataset 2024

TMDB (The Movie Database) is a comprehensive movie database. This dataset contains about 1 million movies from TMDB, with details such as title, rating, release date, revenue, and genre.

Direct use: https://go.hyper.ai/r9ks2

7. InfiMM-WebMath-40B Multimodal Mathematical Reasoning Dataset

This is a large open-source multimodal dataset designed specifically for mathematical reasoning tasks. It contains 24 million web pages, 85 million associated image URLs, and 40 billion text tokens, all carefully extracted and filtered from CommonCrawl (2019-2023).

Direct use: https://go.hyper.ai/P8m9l

8. VoiceAssistant-400K Voice Assistant Optimization Dataset

VoiceAssistant-400K is a dataset optimized specifically for voice assistants. It aims to help models avoid generating code symbols when acting as a voice assistant, making them more practical in real applications.

Direct use: https://go.hyper.ai/KGIM0

9. Top 5k Albums of All Time: Music Album Review Dataset

This dataset contains the top 5k albums of all time as voted by the community users of http://rateyourmusic.com. It was crawled on October 12, 2021 and includes attributes such as rank, album name, artist name, release date, genre, description, average rating, number of ratings, and number of reviews.

Direct use: https://go.hyper.ai/c4Olt

10. Spotify Daily Top 200 Songs: Music Trend Dataset

This dataset contains Spotify's daily worldwide Top 200 song charts from 2017 to 2021. It covers more than 350k songs, giving researchers and music lovers rich material for analyzing popularity trends, music preferences, and related questions.

Direct use: https://go.hyper.ai/afvbK

For more public datasets, please visit:

https://hyper.ai/datasets

Selected Public Tutorials

1. One-click deployment of Llama-3.2-11B-Vision-Instruct

This is the 11B-parameter model in the Llama 3.2-Vision multimodal large model series. It supports high-resolution image input (1120×1120 pixels) and uses a cross-attention mechanism to feed image features into the language model. The series is available in base (completion) and instruction-tuned chat variants. Go to the official website, clone and start the container, then copy the API address to try out model inference.

Direct use: https://go.hyper.ai/DKGzm

2. ComfyUI Littletinies fairy tale illustration generation demo

The model generates hand-drawn, cartoon-style images from text prompts. It is particularly suited to creating whimsical, stylized illustrations with a classic cartoon aesthetic. The generated images have hand-drawn textures, smooth strokes, and soft colors. The model and environment are already deployed, so you can run inference by following the tutorial instructions.

Direct use: https://go.hyper.ai/YHu0a

We have also set up a Stable Diffusion tutorial discussion group. Scan the QR code and add the note [SD tutorial] to join the group, discuss technical issues, and share your results!

Community Articles

1. Intelligently generate embroidery patterns! The Visual Computing and Digital Textiles Team of Wuhan Textile University released the first multi-stitch embroidery generative adversarial network model, which was accepted by the top journal TVCG

Hu Xinrong's research group from the School of Computer Science and Artificial Intelligence at Wuhan Textile University proposed MSEmbGAN, a multi-stitch embroidery generative adversarial network model, and created the largest embroidery dataset available. The related paper was accepted by the top journal TVCG. This article gives a detailed walkthrough of the paper.

View the full report: https://go.hyper.ai/5t8NQ

2. New results in the authoritative journal Cell Discovery! The team of Hong Liang from Shanghai Jiao Tong University proposed the CPDiffusion model, which can design functional proteins automatically at ultra-low cost

Hong Liang's team from Shanghai Jiao Tong University designed a diffusion probabilistic model framework that learns the implicit mapping between protein sequence, structure, and function at very low training and data cost, and can thereby generate diverse protein sequences. This article gives a detailed walkthrough of the paper.

View the full report: https://go.hyper.ai/ziRvz

3. Selected for ECCV 2024! Covering 54,000+ images, MIT proposed a general model for medical image segmentation, ScribblePrompt, which performs better than SAM

A team from the MIT Computer Science and Artificial Intelligence Laboratory, in collaboration with researchers from Massachusetts General Hospital and Harvard Medical School, proposed ScribblePrompt, a general model for interactive biomedical image segmentation. This neural network-based tool lets annotators flexibly segment biomedical images using different kinds of input, such as scribbles, clicks, and bounding boxes, even for labels and image types not seen during training. This article gives a detailed walkthrough of the paper.

View the full report: https://go.hyper.ai/QQjAf

Popular Encyclopedia Articles

1. Sigmoid function

2. Paired t-Test

3. Contrastive Learning

4. Semi-Supervised Learning

5. Data Augmentation
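As a small taste of the first entry in the list above, here is a minimal sketch of the sigmoid function, assuming the standard logistic definition σ(x) = 1 / (1 + e^(-x)). The function name and the sign-based branching (a common trick to avoid floating-point overflow) are illustrative choices, not taken from the encyclopedia entry itself.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid 1 / (1 + exp(-x)), evaluated without overflow."""
    if x >= 0:
        # -x is non-positive here, so exp(-x) lies in (0, 1].
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, use the equivalent form exp(x) / (1 + exp(x));
    # exp(x) lies in [0, 1), so it cannot overflow either.
    z = math.exp(x)
    return z / (1.0 + z)


if __name__ == "__main__":
    for value in (-5.0, 0.0, 5.0):
        print(value, sigmoid(value))
```

The function maps any real input into the interval from 0 to 1 and satisfies sigmoid(x) + sigmoid(-x) = 1, which is why it is often used to turn a raw score into something that can be read as a probability.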

We have compiled hundreds of AI-related terms to help you understand "artificial intelligence":

https://go.hyper.ai/wiki

One-stop tracking of top AI academic conferences: https://go.hyper.ai/event

That is all for this week's editor's picks. If you have resources you would like to see included on the hyper.ai official website, you are welcome to leave a message or submit an article to let us know!

See you next week!

About HyperAI

HyperAI (hyper.ai) is the leading artificial intelligence and high-performance computing community in China. We are committed to becoming the infrastructure for data science in China and to providing rich, high-quality public resources for domestic developers. So far, we have:

* Provided domestic accelerated download nodes for 1300+ public datasets

* Included 400+ classic and popular online tutorials

* Interpreted 100+ AI4Science paper cases

* Supported search for 500+ related terms

* Hosted the first complete Apache TVM Chinese documentation in China

Visit the official website to start your learning journey:

https://hyper.ai
