Text Classification
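Text classification is a core task in natural language processing that assigns text to predefined categories. By analyzing the content of a text, it identifies features such as topic, sentiment, or intent, which supports efficient organization and retrieval of information. In recent years, deep learning models such as XLNet and RoBERTa have significantly improved text classification performance. Benchmark datasets such as GLUE and AG News are widely used to evaluate models.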
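To make the task concrete, here is a minimal sketch of assigning text to predefined categories. It assumes a toy, made-up training set and a hand-rolled multinomial Naive Bayes over raw word counts; the class name, helper function, and example sentences are all hypothetical and are not taken from any dataset or model listed on this page.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split on whitespace; real systems use a proper tokenizer.
    return text.lower().split()


class NaiveBayesClassifier:
    """Toy multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Start from the log prior of the class.
            score = math.log(doc_count / self.total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                # Smoothed log likelihood of each token given the class.
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data: two predefined categories.
train_texts = [
    "the team won the match",
    "the player scored a goal",
    "stocks fell on the market",
    "the bank raised interest rates",
]
train_labels = ["sports", "sports", "business", "business"]

clf = NaiveBayesClassifier().fit(train_texts, train_labels)
prediction = clf.predict("the market and interest rates")
print(prediction)  # business
```

The deep learning models mentioned above replace the word-count features with learned representations, but the interface is the same: a text goes in and one of a fixed set of labels comes out.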
20 Newsgroups
RoBERTaGCN
20NEWS
RoBERTaGCN
Adverse Drug Events (ADE) Corpus
ade_corpus_v2 (Ade_corpus_v2_classification)
AffCon 2020 Emotion Detection
AG News
Amazon-2
Amazon-5
amazon_reviews_multi
An Amharic News Text classification Dataset
Naive Bayes using Tf-idf features
Arxiv HEP-TH citation graph
BigBird
arXiv-10
Protoformer
BANKING77
BLURB
BioLinkBERT (large)
book-text-classifier
catalonia_independence
clinc_oos
DBpedia
XLNet
DODF Data
ULMFiT (pre-trained vocab, no gradual unfreezing)
emotion
Facebook Media
financial_phrasebank
FMC-MWO2KG
Flair
GLUE
GLUE COLA
GLUE MRPC
GLUE QQP
GLUE RTE
GLUE SST2
GLUE STSB
hate_speech18
HateXplain
Hyperpartisan News Detection
BigBird
Hyperpartisan
IMDb
IMDb Movie Reviews
Logistic Regression
KLUE
Lot-insts
Character-BERT+RS
MNIST
MR
MTEB
ST5-XXL
MuLD (Character Type)
MVICTOR (type)
New_York_Times_Topics
NewsDiscourse
NICE-2
NICE-45
NSFW-Safe-Dataset
Ohsumed
SGCN
OneStopEnglish (Readability Assessment)
RoBERTa-RF-T1 hybrid
Overruling
Custom Legal-BERT
Patents
BigBird
R52
1-6 BertGCN
R8
RoBERTaGCN
RCV1
HiLAP (bow-CNN)
RusAge: Corpus for Age-Based Text Classification
LSVC + linguistic features + publishing attributes
Searchsnippets
SemEval 2014 Task 4 (Restaurants)
SILICONE Benchmark
Social media attributions of YouTube comments
Sogou News
BERT-ITPT-FiT
SST-2
SST2
STOPS-2
ERNIE 2.0
STOPS-41
SVICTOR (type)
tecla
Terms of Service
This is not a Dataset
ThreatGram 101 - Extreme Telegram Data
GPT-2
TRAC2-Benghali. Task 2.
BERT
TRAC2-English. Task2.
TREC-10
BERT
TREC-50
TREC-6
Automatic Label Error Correction
Twitter
Twitter Sentiment Analysis
Logistic Regression
Twitter-US
UK Key Stage Readability
Unknown
WeeBit (Readability Assessment)
BERT-FP-LBL
WNUT-2020 Task 2
NutCracker
Yahoo! Answers
BERT-ITPT-FiT
Yelp-2
Yelp-5
HAHNN (CNN)