One-Click Deployment of Phi-3.5 mini and vision! Multimodal Reading Benchmark Dataset MRR-Benchmark Is Online, Including 550 Question-Answer Pairs

Small models are back! Microsoft has released three open-source Phi-3.5 models in one go, each aimed at a different kind of task, and they outperform comparable models on multiple benchmarks.

Among them, Phi-3.5-mini-instruct is designed for devices with limited memory or computing power. Despite its small parameter count, it shows strong reasoning capabilities and handles tasks such as code generation and multilingual understanding. Phi-3.5-vision-instruct stands out in the multimodal field: it processes text and visual information together and handles tasks such as image understanding and video summarization.

HyperAI has now launched model deployment tutorials for both the mini and vision versions. Scroll down for the links~

From September 2nd to September 6th, updates on the hyper.ai official website:

* High-quality tutorials: 3

* High-quality public datasets: 10

* Selected community articles: 3

* Popular encyclopedia entries: 5

* Top conferences with deadlines in September: 5

Visit the official website: hyper.ai

We would also like to recommend an online academic sharing event. Zhou Ziyi, a postdoctoral fellow at Shanghai Jiao Tong University, will give a talk titled "Small Sample Learning Methods for Protein Language Models". Expect plenty of practical content; click the link below to reserve a spot ⬇️

https://hdxu.cn/6Bjom

Selected Public Tutorials

1. One-click deployment of Phi-3.5-mini-instruct

Phi-3.5-mini-instruct supports a context length of 128K tokens and is suited to tasks such as code generation, math problem solving, and logic-based reasoning. The model performs well in multilingual and multi-turn dialogue tasks, and it outperforms other models of the same class on the RepoQA benchmark. This tutorial is a one-click deployment demo of the model: clone and start the container, then copy the generated API address to try out model inference.

Direct use: https://go.hyper.ai/F7smR

2. One-click deployment of Phi-3.5-vision-instruct

The Phi-3.5-vision-instruct model offers broad image understanding, optical character recognition (OCR), chart and table parsing, and multi-image or video clip summarization, which makes it well suited to a variety of AI-driven applications. It shows significant performance improvements on benchmarks related to image and video processing. The model and environment are already deployed, so you can follow the tutorial instructions to run inference with the model directly.

Direct use: https://go.hyper.ai/zN9Bx

3. Online Tutorial | Generate a 10,000-word suspense novel in 1 minute: LongWriter-glm4-9b breaks through the bottleneck of long text output

LongWriter is an open-source project developed by Tsinghua University that uses a long-context large language model (LLM) to generate very long text (more than 10,000 words). This tutorial is a one-click deployment demo of the model: clone and start the container, then copy the generated API address to try out model inference.

Direct use: https://go.hyper.ai/p6SiO

Selected Public Datasets

1. MRR-Benchmark Multimodal Reading Benchmark Dataset

The Multimodal Reading (MMR) benchmark includes 550 annotated question-answer pairs across 11 different tasks covering text, fonts, visual elements, bounding boxes, spatial relations, and grounding, with carefully designed evaluation metrics.

Direct use: https://go.hyper.ai/deAmf

2. EyeDentify Pupil Diameter Estimation Dataset

The dataset contains 212,073 images from 51 participants. The research team used a Tobii eye tracker to collect accurate pupil diameter measurements while a built-in webcam recorded facial video. The dataset aims to address the shortage of data for estimating pupil diameter from ordinary webcam images.

Direct use: https://go.hyper.ai/iHjxC

3. Traffic Road Object Detection: Polish Road Object Detection Dataset

This dataset contains 11k annotated images of Polish roads, curated for object detection. The data was collected with car-mounted cameras on Polish roads, mainly in Krakow. The images capture a variety of scenes, including different road types and lighting conditions (day and night).

Direct use: https://go.hyper.ai/Sl0k5

4. C2A Human Detection Dataset in Disaster Scenarios

The C2A (combined to application) dataset contains 10,215 high-resolution images covering 4 disaster scene types (fire/smoke, flood, collapsed building/rubble, and traffic accident) and 5 human posture categories (bending, kneeling, lying down, sitting, and standing upright). Image resolutions range from 123×152 to 5184×3456 pixels, and there are more than 360,000 annotated human instances.

Direct use: https://go.hyper.ai/15dMR

5. Skin Conditions Image Dataset: 6 Skin Conditions

This dataset contains augmented images of 6 different skin conditions: acne, cancer, eczema, keratosis, milia, and rosacea. Each category contains 399 images, for a total of 2,394 images.

Direct use: https://go.hyper.ai/tWO7x

6. Penn-Fudan Pedestrian Detection and Segmentation Dataset

This dataset contains 170 high-resolution RGB images captured from video sequences, and each image contains 0 to 6 pedestrians. The position of each pedestrian is annotated with a rectangular box and a mask, providing bounding-box coordinates for training and testing object detectors.

Direct use: https://go.hyper.ai/1CqaN
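To illustrate how bounding-box coordinates relate to a mask annotation like the one described for Penn-Fudan, here is a minimal sketch. It assumes a single-pedestrian binary mask stored as a nested list. The function name `bounding_box` and the toy mask are hypothetical, and the dataset's own file format is not shown here.

```python
from typing import List, Optional, Tuple


def bounding_box(mask: List[List[int]]) -> Optional[Tuple[int, int, int, int]]:
    """Return (x_min, y_min, x_max, y_max) of the nonzero pixels, or None if the mask is empty."""
    # Rows that contain at least one foreground pixel.
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    # Column indices of every foreground pixel.
    cols = [x for row in mask for x, value in enumerate(row) if value]
    return min(cols), rows[0], max(cols), rows[-1]


# Toy 5x6 mask with one blob (hypothetical data, not taken from the dataset).
toy_mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
print(bounding_box(toy_mask))  # (1, 1, 3, 3)
```

An image with several pedestrians would need one such mask per instance before applying this helper.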

7. Tecnalia Electrical Equipment Waste Hyperspectral Dataset

The Tecnalia hyperspectral dataset contains different non-ferrous metal fractions from electrical and electronic equipment waste, such as copper, brass, aluminum, stainless steel, and cupronickel. The images contain 76 evenly distributed wavelengths in the spectral range from 415.05 nm to 1008.10 nm.

Direct use: https://go.hyper.ai/1TBGz
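For a sense of the band spacing implied by the Tecnalia description, the sketch below places 76 wavelengths evenly between 415.05 nm and 1008.10 nm. It assumes both endpoints are included and the spacing is exactly uniform, which gives roughly 7.91 nm between neighboring bands. The actual band centers shipped with the dataset may differ, and the helper name `evenly_spaced` is hypothetical.

```python
def evenly_spaced(start: float, stop: float, count: int) -> list:
    """Return `count` values from start to stop inclusive with uniform spacing."""
    if count < 2:
        raise ValueError("count must be at least 2")
    step = (stop - start) / (count - 1)
    return [start + i * step for i in range(count)]


# Assumption: endpoints inclusive, 76 bands as stated in the dataset description.
wavelengths_nm = evenly_spaced(415.05, 1008.10, 76)
spacing_nm = wavelengths_nm[1] - wavelengths_nm[0]
print(len(wavelengths_nm), round(spacing_nm, 2))
```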

8. Car Crash Prediction Dataset

This dataset contains 10k dashcam images, all taken from 100K Dashcam videos. Frames were extracted from the videos at 5-second intervals, and the dataset contains two classes: collision and no collision. Annotations are also provided in an .xlsx file.

Direct use: https://go.hyper.ai/jV1hL
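The 5-second sampling described for the car crash dataset can be sketched as a list of timestamps at which frames would be taken from a clip. This is an illustration only: it assumes the first frame is taken at t = 0, the 40-second clip length is made up, and the function name `frame_timestamps` is hypothetical. The dataset authors' actual extraction script is not known.

```python
def frame_timestamps(duration_s: float, interval_s: float = 5.0) -> list:
    """Return the timestamps (in seconds) sampled every `interval_s` seconds, starting at 0."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    # Number of whole intervals that fit in the clip, plus the frame at t = 0.
    count = int(duration_s // interval_s) + 1
    return [i * interval_s for i in range(count)]


# Hypothetical 40-second dashcam clip.
print(frame_timestamps(40))
```

Under these assumptions a 40-second clip yields 9 frames, one each at 0, 5, 10, and so on up to 40 seconds.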

9. PKU-Market-PCB Printed Circuit Board Defect Dataset

PKU-Market-PCB is a public synthetic PCB dataset containing 1,386 images with 6 types of defects (missing hole, mouse bite, open circuit, short, spur, and spurious copper). It can be used for detection, classification, and registration tasks.

Direct use: https://go.hyper.ai/VnbpT

10. PKU-Market-Phone Mobile Phone Screen Surface Defect Segmentation Dataset

The dataset contains 3 types of surface defects: oil stains, scratches, and spots. There are 400 images of each defect type, for a total of 1.2k images. The research team created the defects themselves to simulate an industrial environment, and the images were captured by an industrial camera at a resolution of 1920×1080. The dataset is split into training, validation, and test sets at a ratio of 6:2:2 and uses the PASCAL VOC format.

Direct use: https://go.hyper.ai/K6u2o
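As a quick check of what a 6:2:2 split means for the 1,200 PKU-Market-Phone images, here is a minimal sketch that partitions a list of image IDs into 720 training, 240 validation, and 240 test items. The shuffling, the seed, the ID format, and the function name `split_622` are all assumptions for illustration. The dataset's published split may assign images differently, for example per defect type.

```python
import random


def split_622(items, seed=0):
    """Shuffle items reproducibly and split them 6:2:2 into train, validation, and test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 6 // 10
    n_val = len(items) * 2 // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Hypothetical image IDs standing in for the 1,200 images.
image_ids = [f"img_{i:04d}" for i in range(1200)]
train, val, test = split_622(image_ids)
print(len(train), len(val), len(test))  # 720 240 240
```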

For more public datasets, please visit:

https://hyper.ai/datasets

Community Articles

1. A Complete Breakdown of AlphaFold 3 by Zhong Bozitao of Shanghai Jiao Tong University: Making the most of data to predict all biomolecular structures with atomic precision, but it is not perfect

Recently, at the AI for Bioengineering Summer School at Shanghai Jiao Tong University, Dr. Zhong Bozitao gave a talk titled "AlphaFold 3: Principles, Applications and Prospects", in which he organized what he has learned, surveyed related results from the research community, and shared his insights into AlphaFold 3. This article summarizes the core content of the talk.

View the full report: https://go.hyper.ai/Ln2Yv

2. On the cover of the Proceedings of the National Academy of Sciences (PNAS)! A Chinese team releases an AI-adaptive micro-spectrometer that can be produced at the wafer level

The Fudan University team proposed a new miniaturized reconstructive spectrometer design that combines the advantages of traditional spectrometers and computational reconstructive spectrometers. With an integrated self-referencing narrowband filter channel, the AI algorithm can search spectral and algorithm parameters simultaneously in a higher-dimensional parameter space. This article is a detailed interpretation of the research paper.

View the full report: https://go.hyper.ai/GEKE4

3. Covering 7 million question-answer entries, Shanghai AI Lab releases ChemLLM, with professional capabilities comparable to GPT-4

The Shanghai Artificial Intelligence Laboratory has released ChemLLM, a large language model for chemistry. ChemLLM performs a wide range of chemistry tasks through fluent conversational interaction. Its performance on core tasks is comparable to that of GPT-4, and in general scenarios it performs on par with LLMs of similar size. This article is a detailed interpretation of the research paper.

View the full report: https://go.hyper.ai/3bdMW

Popular Encyclopedia Articles

1. Reciprocal Rank Fusion (RRF)

2. Learning Rate

3. Nuclear Norm

4. Pareto Front

5. Data Augmentation

Hundreds of AI-related terms have been compiled here to help you understand "artificial intelligence":

https://go.hyper.ai/wiki

One-stop tracking of top AI academic conferences: https://go.hyper.ai/event

That is all for this week's editor's selection. If you have resources you would like to see on the hyper.ai official website, feel free to leave a message or submit an article!

See you next week!

About HyperAI

HyperAI (hyper.ai) is the leading artificial intelligence and high-performance computing community in China. We are committed to becoming the infrastructure of data science in China and to providing rich, high-quality public resources for domestic developers. So far, we have:

* Provided domestic accelerated download nodes for 1300+ public datasets

* Included 400+ classic and popular online tutorials

* Interpreted 100+ AI4Science paper cases

* Supported search for 500+ related terms

* Hosted the first complete Apache TVM Chinese documentation in China

Visit the official website to start your learning journey:

https://hyper.ai
