Weekly Editor's Picks | Free Sora Alternative, Online Python Basic Tutorial, MCFEND Chinese Fake News Detection Dataset Launched

After ChatGPT, OpenAI released another blockbuster product, the Sora video model. While other models still struggle to keep videos coherent for more than a few seconds, Sora has already extended video length to 60 seconds. However, Sora is currently accessible only to a small number of researchers and creators.
Don't worry: HyperAI offers an open-source AI video generation solution that combines Stable Diffusion + Prompt Travel + AnimateDiff. It starts with one click and is free to use! Enjoy a visual feast in 1 second with this open-source Sora alternative. An online tutorial is now available on the hyper.ai official website, so come and try it!
From March 25 to March 29, hyper.ai official website updates:
* High-quality public datasets: 10
* Selected high-quality tutorials: 2
* Community article selection: 4 articles
* Popular encyclopedia entries: 5
Visit the official website: hyper.ai
Selected public datasets
1. MCFEND Multi-Source Benchmark Dataset for Chinese Fake News Detection
The MCFEND dataset is a multi-source benchmark dataset for Chinese fake news detection, jointly constructed by Hong Kong Baptist University, the Chinese University of Hong Kong, and other institutions. It collects news from diverse sources such as social platforms, instant messaging applications, and traditional online news media, totaling 23,974 news items, all of which have been verified by 14 authoritative international fact-checking organizations.
Direct use:
2. Fin-Eva Version 1.0 Professional Chinese-Language Evaluation Dataset for the Financial Field
Fin-Eva Version 1.0 is a financial evaluation dataset jointly launched by Ant Group and Shanghai University of Finance and Economics. It covers multiple financial scenarios such as wealth management, insurance, and investment research, as well as professional financial subjects, with more than 13,000 evaluation questions in total.
Direct use:
3. VidProM Large-Scale Text-to-Video Prompt Dataset
The VidProM dataset is the first large-scale real-user text-to-video prompt dataset jointly developed by the University of Technology Sydney and Zhejiang University. It contains 1.67 million unique text-to-video prompts and 6.69 million videos generated by four state-of-the-art diffusion models.
Direct use:
4. FindingEmo Image Emotion Recognition Dataset
FindingEmo is a new image dataset built by KU Leuven and other institutions, specifically for emotion recognition tasks. The dataset contains annotations for 25,000 images.
Direct use:
5. GPD Crowd Flow and Traffic Speed Dataset
The paper "Spatio-Temporal Few-Shot Learning via Diffusive Neural Network Generation", from the Center for Urban Science and Computing, Department of Electronic Engineering, Tsinghua University, was accepted by ICLR 2024. The research proposes the GPD (Generative Pre-Trained Diffusion) model to achieve spatio-temporal learning in data-sparse scenarios. This dataset contains the open-source data and code released with the paper.
Direct use:
6. AlgoPuzzleVQA Multimodal Algorithm Puzzle Dataset
The AlgoPuzzleVQA dataset is a multimodal reasoning dataset constructed by the Singapore University of Technology and Design to challenge and evaluate the ability of multimodal language models in solving algorithmic puzzles that require visual understanding, language understanding, and complex algorithmic reasoning.
Direct use:
7. UltraSafety Large Model Safety Evaluation Dataset
The UltraSafety dataset was jointly created by Renmin University, Tsinghua University, and Tencent to evaluate and improve the safety of large models. It takes 1,000 safety seed instructions from AdvBench and MaliciousInstruct, and uses Self-Instruct to bootstrap another 2,000 instructions.
Direct use:
8. NAIP-S2 US National Super-Resolution Remote Sensing Dataset
NAIP-S2 is a super-resolution remote sensing dataset released by the Allen Institute for Artificial Intelligence, which contains paired NAIP and Sentinel-2 images of the continental United States. This dataset has a wide range of applications in the field of remote sensing science, especially in surface monitoring, resource management, and environmental change assessment, providing high-precision data support.
Direct use:
9. CLIcK Korean Culture and Language Intelligence Dataset
The CLIcK dataset was created by the Korea Advanced Institute of Science and Technology to fill the gap in assessing the cultural and linguistic knowledge of Korean large models. It contains 1,995 question-answer pairs drawn from official Korean exams and textbooks. The pairs fall into two categories, language and culture, which are divided into 11 subcategories. Each sample carries fine-grained annotations indicating the cultural and linguistic knowledge required to answer the question.
Direct use:
10. Data used in TacticAI research
This dataset contains the data collected for the "TacticAI: Football Tactical Artificial Intelligence Assistant" research.
Direct use:
For more public datasets, please visit:
Selected Public Tutorials
1. Generate Random Numbers in Python
True random numbers are difficult to produce on a computer because computers execute deterministic, specified operations. Pseudo-random numbers, however, can be generated by programs. This tutorial shows you step by step how to generate random numbers in Python.
Run online:
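As a rough illustration of the pseudo-randomness described above (this is not the tutorial's own code), the sketch below uses only Python's standard `random` and `secrets` modules. The helper names `seeded_samples` and `roll_dice` are hypothetical and chosen just for this example. It shows that a seeded generator reproduces the same sequence, which is what makes it "pseudo" random.

```python
import random
import secrets


def seeded_samples(seed, count):
    """Return `count` pseudo-random floats in [0.0, 1.0) from a seeded generator."""
    rng = random.Random(seed)  # independent generator with a fixed seed
    return [rng.random() for _ in range(count)]


def roll_dice(rng, rolls):
    """Simulate `rolls` throws of a six-sided die using the given generator."""
    return [rng.randint(1, 6) for _ in range(rolls)]  # randint includes both ends


# The same seed always yields the same sequence: the numbers are deterministic.
first = seeded_samples(42, 3)
second = seeded_samples(42, 3)
print(first == second)  # True

# Integers in a closed range, drawn from another seeded generator.
dice = roll_dice(random.Random(7), 5)
print(dice)

# For security-sensitive values, the secrets module draws from the
# operating system's randomness source instead of a seeded algorithm.
token = secrets.token_hex(8)  # 8 random bytes rendered as 16 hex characters
print(token)
```

Seeding is useful when an experiment needs to be repeatable. For passwords or tokens, `secrets` is generally the safer choice.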
2. Developing Neural Networks Step by Step Using PyTorch
PyTorch is a powerful Python library for building deep learning models that simplifies defining neural networks, training them, and running inference. This tutorial shows you how to load a CSV dataset, define a multilayer perceptron model, and train and evaluate it in PyTorch, providing guidance for creating deep learning neural network models.
Run online:
Community Articles
This tutorial combines Stable Diffusion + Prompt Travel + AnimateDiff into an open-source AI video generation solution that is free for everyone to use. Launch this open-source Sora alternative with one click and enjoy a visual feast in 1 second.
Run online:
Professor Liu Shao's team from the Department of Pharmacy, Xiangya Hospital, Central South University, has established an integrated molecular network framework (IMN4NPD) that can comprehensively mine the pharmacological components of natural medicines. It not only accelerates the dereplication of extensive clusters in molecular networks, but also provides annotations for self-loops and paired nodes that are often overlooked in existing research methods. The relevant results have been published in the journal "Bioinformatics".
View the full report:
Researchers from Central South University proposed a method called AdaDR, which deeply integrates node features and topological structures to perform drug repositioning. It models the interaction information between them using adaptive graph convolution operations, thereby enhancing the model's expressive power. The related paper has been published in "American Chemical Society".
View the full report:
Corner kicks are often a great opportunity to execute coaching tactics. To this end, Google DeepMind and Liverpool Football Club jointly launched TacticAI, which uses geometric deep learning together with predictive and generative models to give professionals insights into corner kick tactics. The results show that the tactical layouts proposed by TacticAI are favored by human expert evaluators in 90% of cases. Its accuracy in predicting the ball receiver is as high as 74%, and shooting opportunities increase by 13%. The relevant results have been published in the journal "Nature Communications".
View the full report:
Popular Encyclopedia Articles
1. Paired t-Test
2. Representation Learning
3. Rotary Position Embedding (RoPE)
4. Cognitive Search
5. Case-Based Reasoning (CBR)
We have compiled hundreds of AI-related terms to help you understand "artificial intelligence":
Bilibili Live Broadcast Preview
Date | Time | Content |
Monday, April 1 | 10:00 | Harvard CS50 Course (2023) |
Tuesday, April 2 | 10:00 | Harvard CS50 Course (2022) |
Wednesday, April 3 | 10:00 | MIT Deep Learning Course |
Thursday, April 4 | 10:00 | NVIDIA's press conferences over the years |
Friday, April 5 | 10:00 | Machine Learning Compilation Course |
Friday, April 5 | 18:00 | Tesla AI Day |
Saturday, April 6 | 10:00 | Google IO conferences over the years |
Sunday, April 7 | 10:00 | Stanford HAI Symposium |
Super Neuro TV broadcasts live 24/7. Click to get the "electronic pickles" in the AI field:
http://live.bilibili.com/26483094
That's all for this week's editor's picks. If you have resources you would like to see included on the hyper.ai official website, feel free to leave a message or submit an article to let us know!
See you next week!
About HyperAI
HyperAI (hyper.ai) is the leading artificial intelligence and high-performance computing community in China. We are committed to becoming the infrastructure for data science in China and to providing rich, high-quality public resources for domestic developers. So far, we have:
* Provided domestic accelerated download nodes for 1200+ public datasets
* Included 300+ classic and popular online tutorials
* Interpreted 100+ AI4Science paper cases
* Supported search for 500+ related terms
* Hosted the first complete Apache TVM Chinese documentation in China
Visit the official website to start your learning journey: