NeurIPS 2024 Dataset Summary｜Covering Cloud Removal, Chemical Spectroscopy, Singing Audio, Autonomous Driving, Insect Specimens, and More

2 years ago

NeurIPS, short for the Conference on Neural Information Processing Systems, is an annual academic conference on neural information processing systems. First held in 1987, it was originally abbreviated NIPS. As artificial intelligence has developed rapidly, the conference's influence has grown, and it now draws attention from a growing number of researchers and companies. To better reflect the wide range of fields the conference covers, NIPS was officially renamed NeurIPS in 2018.

Today, NeurIPS is one of the world's most authoritative academic conferences in artificial intelligence, attracting scholars, entrepreneurs, and researchers from around the globe.

This year marks the 38th NeurIPS (NeurIPS 2024), and the volume of academic output remains impressive. The conference reportedly received 15,671 valid submissions this year, of which about 4,000 were accepted.
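As a rough back-of-the-envelope check, the two figures above imply an acceptance rate of about a quarter. The sketch below assumes the approximate count of 4,000 accepted papers quoted here, so the result is only indicative; the function and constant names are illustrative.

```python
def acceptance_rate(accepted: int, submitted: int) -> float:
    """Return the share of submissions that were accepted, as a fraction."""
    if submitted <= 0:
        raise ValueError("submitted must be positive")
    return accepted / submitted


# Figures quoted in this article; 4,000 is an approximation, not an exact count.
SUBMITTED = 15_671
ACCEPTED_APPROX = 4_000

rate = acceptance_rate(ACCEPTED_APPROX, SUBMITTED)
print(f"Approximate acceptance rate: {rate:.1%}")
```

With these inputs the rate comes out at roughly 25.5%. The exact value depends on the final accepted count, which this article gives only approximately.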

HyperAI has compiled 10 high-quality open source datasets from those accepted at the conference. They cover cloud removal, chemical spectra, singing audio, autonomous driving, insect specimens, and more, and you can download them as needed.

Click here to learn more about the summit:
https://go.hyper.ai/vWvAW

NeurIPS 2024 Dataset Summary

1. AllClear Public Cloud Removal Dataset

Publishing Agency: Cornell University, Columbia University

Estimated size: 22.42 GB

Download address: https://go.hyper.ai/iRqtm

Clouds in satellite images pose a significant challenge for downstream applications. A major problem facing current cloud removal research is the lack of comprehensive benchmarks and sufficiently large and diverse training datasets. AllClear is currently the largest public cloud removal dataset, containing 23,742 globally distributed regions of interest (ROIs), covering a variety of land use patterns, and a total of 4 million images.

2. Muharaf Handwritten Arabic Dataset

Publishing Agency: North Carolina State University, Holy Spirit University of Kaslik, Lebanese Historical Society

Estimated size: 9.83 GB

Download address: https://go.hyper.ai/yztH6

The Muharaf dataset is a machine learning dataset for handwritten Arabic recognition. It contains more than 1.6k images of historical handwritten pages, transcribed by experts in archival Arabic. Each page image comes with the polygon coordinates of its text lines and annotations for basic page elements, with the aim of advancing the state of the art in handwritten text recognition (HTR).

3. Chemical Multimodal Spectroscopic Datasets

Publishing Agency: IBM Research, University of Zurich, EPFL, NCCR Catalysis

Estimated size: 9.7 GB

Download address: https://go.hyper.ai/ZdXk8

The dataset contains simulated 1H-NMR, 13C-NMR, HSQC-NMR, infrared, and mass spectrometry (positive and negative ion modes) spectra for 790,000 molecules extracted from chemical reactions in patent data. Its core value lies in combining several spectral modalities, mirroring how human experts analyze molecular structures. This is expected to help automate structural analysis and streamline the molecular discovery process from synthesis to structure determination.

4. GTSinger Singing Audio Dataset

Publishing Agency: Zhejiang University

Estimated size: 28.94 GB

Download address: https://go.hyper.ai/7jdi2

The dataset contains 80.59 hours of singing recorded in professional studios by 20 professional singers in 9 different languages, including Chinese, English, Japanese, Korean, etc., providing researchers with a resource library with extremely rich timbres and styles.

5. DrivingDojo Autonomous Driving Dataset

Publishing Agency: Chinese Academy of Sciences, Meituan, Centre for Artificial Intelligence and Robotics of the Hong Kong Institute of Science & Innovation (Chinese Academy of Sciences)

Download address: https://go.hyper.ai/W3eDT

The dataset contains about 18k video clips recorded in cities such as Beijing, Shenzhen, and Xuzhou under a range of weather and lighting conditions. It covers longitudinal maneuvers such as acceleration, emergency braking, and stop-and-go driving, as well as lateral maneuvers such as U-turns, overtaking, and lane changes. The dataset also deliberately includes videos with a large number of multi-agent interaction trajectories, with the aim of improving the prediction and control capabilities of world models in complex driving environments.

6. Multimodal Insect Biodiversity Dataset

Publishing Agency: Centre for Biodiversity Genomics, University of Guelph, University of Waterloo, etc.

Estimated size: 37.71 GB

Download address: https://go.hyper.ai/Ljjwp

The BIOSCAN-5M dataset contains detailed information on more than 5 million insect specimens, significantly expanding existing image-based biological datasets. In addition to taxonomic labels, raw nucleotide barcode sequences, assigned barcode index numbers, and geographic information, it includes further multimodal information such as specimen size. The dataset is intended to support understanding and monitoring of global insect biodiversity.

7. OpenSatMap High-Resolution Satellite Dataset

Publishing Agency: Chinese Academy of Sciences, Centre for Artificial Intelligence and Robotics of the Hong Kong Institute of Science & Innovation (Chinese Academy of Sciences), Tencent Maps, Beijing University of Posts and Telecommunications

Estimated size: 57.7 GB

Download address: https://go.hyper.ai/g54aa

This dataset is a high-resolution satellite dataset designed for large-scale map construction. It features fine-grained, instance-level annotations and contains 3,787 high-resolution satellite images, covering multiple cities in China as well as more than 50 cities and 18 countries around the world.

8. Natural Species Sound Dataset

Publishing Agency: University of Massachusetts Amherst, iNaturalist

Estimated size: 131.26 GB

Download address: https://go.hyper.ai/lyTcc

The dataset contains 230,000 audio files capturing sounds from more than 5,500 species contributed by more than 27,000 recorders worldwide. The dataset contains the sounds of birds, mammals, insects, reptiles, and amphibians, with audio and species labels derived from observations submitted to iNaturalist.

9. MINT-1T Text-Image Pair Multimodal Dataset

Publishing Agency: University of Washington, Stanford University, Salesforce Research, etc.

Download address: https://go.hyper.ai/kROfu

The dataset contains 1 trillion text tokens and 3.4 billion images, making it 10 times larger than the previous largest open source dataset. In addition to HTML documents, it includes PDF documents and ArXiv papers, a diversity that significantly improves its coverage of scientific documents.

10. AudioSetCaps Audio Caption Dataset

Publishing Agency: Northwestern Polytechnical University, Xi'an Lianfeng Acoustic Technology Co., Ltd., Nanyang Technological University, Institute of Acoustics, Chinese Academy of Sciences, etc.

Download address: https://go.hyper.ai/rTKdU

AudioSetCaps is an audio-caption dataset built from AudioSet, YouTube-8M, and VGGSound, containing 6,117,099 10-second audio files. Each audio file comes with a descriptive caption and 3 Q&A pairs that serve as metadata for generating the final caption (18,414,789 Q&A pairs in total).
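For readers planning a bulk download, the estimated sizes listed above can be added up to gauge the disk space needed. The following is a minimal sketch that uses only the sizes quoted in this article. DrivingDojo, MINT-1T, and AudioSetCaps list no size here, so they are marked as unknown and left out of the total. The dictionary and function names are illustrative, not part of any HyperAI tooling.

```python
from typing import Dict, List, Optional, Tuple

# Estimated sizes in GB as quoted in this article; None means no size was listed.
ESTIMATED_SIZE_GB: Dict[str, Optional[float]] = {
    "AllClear": 22.42,
    "Muharaf": 9.83,
    "Chemical Multimodal Spectroscopic": 9.7,
    "GTSinger": 28.94,
    "DrivingDojo": None,
    "BIOSCAN-5M": 37.71,
    "OpenSatMap": 57.7,
    "Natural Species Sound": 131.26,
    "MINT-1T": None,
    "AudioSetCaps": None,
}


def total_known_size_gb(
    sizes: Dict[str, Optional[float]],
) -> Tuple[float, List[str]]:
    """Sum the listed sizes and report which datasets have no listed size."""
    known = [size for size in sizes.values() if size is not None]
    unknown = [name for name, size in sizes.items() if size is None]
    return sum(known), unknown


total_gb, missing = total_known_size_gb(ESTIMATED_SIZE_GB)
print(f"Listed datasets need about {total_gb:.2f} GB")
print("No size listed for:", ", ".join(missing))
```

The seven listed sizes add up to roughly 297.56 GB. Because three datasets have no listed size, the real footprint of all ten will be larger than this figure.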

The above are the NeurIPS 2024 datasets compiled by HyperAI. If you have resources you would like to see included on the hyper.ai official website, you are welcome to leave a message or submit a contribution to let us know!

About HyperAI

HyperAI (hyper.ai) is the leading artificial intelligence and high-performance computing community in China. We are committed to becoming the infrastructure for data science in China and to providing rich, high-quality public resources for domestic developers. So far, we have:

* Provided domestic accelerated download nodes for 1300+ public datasets

* Collected 400+ classic and popular online tutorials

* Interpreted 200+ AI4Science paper cases

* Supported search for 500+ related terms

* Hosted the first complete Apache TVM Chinese documentation in China

Visit the official website to start your learning journey:

https://hyper.ai
