Natural Language Processing
Performance metrics of mainstream AI models across various tasks, showcasing the state-of-the-art technology
AI Model Performance Benchmarks
Deep Clustering
50 papers | 5 benchmarks
Semantic Dependency Parsing
50 papers | 3 benchmarks
Word Alignment
50 papers | 7 benchmarks
Few-Shot Text Classification
49 papers | 8 benchmarks
Lemmatization
49 papers | 0 benchmarks
Multimodal Deep Learning
49 papers | 1 benchmarks
Punctuation Restoration
49 papers | 0 benchmarks
Sentence Compression
49 papers | 1 benchmarks
Sentence Ordering
49 papers | 1 benchmarks
Graph-to-Sequence
48 papers | 2 benchmarks
In-Context Learning
48 papers | 0 benchmarks
Relation Extraction
48 papers | 50 benchmarks
Review Generation
48 papers | 0 benchmarks
Rumour Detection
48 papers | 2 benchmarks
Chatbot
47 papers | 1 benchmarks
Dialogue State Tracking
47 papers | 7 benchmarks
Entity Disambiguation
47 papers | 11 benchmarks
Grammatical Error Detection
47 papers | 4 benchmarks
Lexical Normalization
47 papers | 1 benchmarks
Lexical Simplification
47 papers | 0 benchmarks
Semantic Parsing
47 papers | 20 benchmarks
Text Categorization
47 papers | 0 benchmarks
Conversational Response Selection
46 papers | 15 benchmarks
Conversational Search
46 papers | 0 benchmarks
Dialogue Management
46 papers | 0 benchmarks
Document Summarization
46 papers | 7 benchmarks
Goal-Oriented Dialogue Systems
46 papers | 0 benchmarks
Hope Speech Detection
46 papers | 2 benchmarks
Benchmarking
45 papers | 2 benchmarks
Blocking
45 papers | 5 benchmarks
Dependency Parsing
45 papers | 15 benchmarks
Emotion-Cause Pair Extraction
45 papers | 2 benchmarks
Empathetic Response Generation
45 papers | 1 benchmarks
Extractive Text Summarization
45 papers | 5 benchmarks
Generative Question Answering
45 papers | 2 benchmarks
knowledge editing
45 papers | 1 benchmarks
Sentence Embeddings
45 papers | 0 benchmarks
Twitter Sentiment Analysis
45 papers | 0 benchmarks
Decipherment
44 papers | 0 benchmarks
GSM8K
44 papers | 1 benchmarks
Lexical Complexity Prediction
44 papers | 0 benchmarks
Morphological Tagging
44 papers | 0 benchmarks
TAR
44 papers | 0 benchmarks
Text Augmentation
44 papers | 0 benchmarks
Automated Essay Scoring
43 papers | 1 benchmarks
Chinese Word Segmentation
43 papers | 6 benchmarks
Novelty Detection
43 papers | 0 benchmarks
Prompt Engineering
43 papers | 16 benchmarks
Sentence Embedding
43 papers | 0 benchmarks
Sentence Summarization
42 papers | 0 benchmarks
Answer Generation
42 papers | 2 benchmarks
Arabic Sentiment Analysis
42 papers | 0 benchmarks
Cross-Lingual NER
42 papers | 28 benchmarks
Relation Classification
42 papers | 8 benchmarks
Spoken Language Understanding
42 papers | 5 benchmarks
Ad-Hoc Information Retrieval
41 papers | 1 benchmarks
Event Extraction
41 papers | 9 benchmarks
Learning with noisy labels
41 papers | 20 benchmarks
Named Entity Recognition (NER)
41 papers | 77 benchmarks
Reinforcement Learning
41 papers | 21 benchmarks
Safety Alignment
41 papers | 0 benchmarks
Aspect Extraction
40 papers | 6 benchmarks
Dialogue Evaluation
40 papers | 2 benchmarks
Hallucination Evaluation
40 papers | 0 benchmarks
Multimodal Sentiment Analysis
40 papers | 5 benchmarks
Continual Learning
39 papers | 32 benchmarks
Dialogue Generation
39 papers | 13 benchmarks
Distractor Generation
39 papers | 1 benchmarks
Intent Discovery
39 papers | 3 benchmarks
Knowledge Base Population
39 papers | 1 benchmarks
Script Generation
39 papers | 0 benchmarks
Semi-Supervised Text Classification
39 papers | 2 benchmarks
Sequential Pattern Mining
39 papers | 1 benchmarks
Sign Language Production
39 papers | 0 benchmarks
Spelling Correction
39 papers | 0 benchmarks
Text Infilling
39 papers | 0 benchmarks
Abstractive Text Summarization
38 papers | 18 benchmarks
Conversational Question Answering
38 papers | 1 benchmarks
coreference-resolution
38 papers | 0 benchmarks
Dialect Identification
38 papers | 0 benchmarks
Discourse Parsing
38 papers | 4 benchmarks
Discourse Segmentation
38 papers | 0 benchmarks
Document AI
38 papers | 1 benchmarks
Document Classification
38 papers | 21 benchmarks
Entity Alignment
38 papers | 10 benchmarks
Low Resource Named Entity Recognition
38 papers | 3 benchmarks
Self-Learning
38 papers | 0 benchmarks
Text Compression
38 papers | 0 benchmarks
Toxic Spans Detection
38 papers | 0 benchmarks
Emotion Recognition in Conversation
37 papers | 16 benchmarks
Implicit Discourse Relation Classification
37 papers | 0 benchmarks
Recipe Generation
37 papers | 5 benchmarks
Sentence-Pair Classification
37 papers | 0 benchmarks
Speech-to-Text Translation
37 papers | 10 benchmarks
Temporal Relation Extraction
37 papers | 1 benchmarks
Translation
37 papers | 7 benchmarks
Bias Detection
36 papers | 5 benchmarks
Hate Speech Detection
36 papers | 15 benchmarks
Headline Generation
36 papers | 1 benchmarks
Intent Classification
36 papers | 4 benchmarks
Intent Recognition
36 papers | 1 benchmarks
Language Modelling
36 papers | 55 benchmarks
Multilingual Named Entity Recognition
36 papers | 0 benchmarks
Multilingual NLP
36 papers | 0 benchmarks
Phrase Grounding
36 papers | 5 benchmarks
Question Generation
36 papers | 8 benchmarks
Attribute Value Extraction
35 papers | 4 benchmarks
Community Question Answering
35 papers | 2 benchmarks
Emotion Classification
35 papers | 9 benchmarks
Joint Entity and Relation Extraction
35 papers | 16 benchmarks
Query-focused Summarization
35 papers | 0 benchmarks
Text Style Transfer
35 papers | 2 benchmarks
NER
34 papers | 5 benchmarks
Column Type Annotation
34 papers | 12 benchmarks
Image Deblurring
34 papers | 9 benchmarks
Morphological Disambiguation
34 papers | 0 benchmarks
Open-Ended Question Answering
34 papers | 0 benchmarks
Question-Answer-Generation
34 papers | 0 benchmarks
Reverse Dictionary
34 papers | 0 benchmarks
Short Text Clustering
34 papers | 8 benchmarks
Temporal Information Extraction
34 papers | 2 benchmarks
Hypernym Discovery
33 papers | 3 benchmarks
Knowledge Base Question Answering
33 papers | 10 benchmarks
Language Identification
33 papers | 6 benchmarks
Long-range modeling
33 papers | 2 benchmarks
Low Resource NMT
33 papers | 0 benchmarks
Morphological Inflection
33 papers | 0 benchmarks
Native Language Identification
33 papers | 1 benchmarks
Code Repair
32 papers | 1 benchmarks
Document-level Event Extraction
32 papers | 1 benchmarks
Entity Resolution
32 papers | 11 benchmarks
Prepositional Phrase Attachment
32 papers | 0 benchmarks
Suggestion mining
32 papers | 0 benchmarks
Aspect Category Detection
31 papers | 4 benchmarks
Clickbait Detection
31 papers | 0 benchmarks
HellaSwag
31 papers | 0 benchmarks
Passage Re-Ranking
31 papers | 2 benchmarks
Table annotation
31 papers | 0 benchmarks
Cross-Lingual Natural Language Inference
30 papers | 4 benchmarks
Open-Domain Question Answering
30 papers | 15 benchmarks
Unsupervised Extractive Summarization
30 papers | 3 benchmarks
Word Sense Disambiguation
30 papers | 15 benchmarks
Dialogue Understanding
29 papers | 0 benchmarks
Keyphrase Generation
29 papers | 1 benchmarks
LAMBADA
29 papers | 1 benchmarks
Medical Named Entity Recognition
29 papers | 2 benchmarks
Natural Language Inference
29 papers | 37 benchmarks
Nested Named Entity Recognition
29 papers | 6 benchmarks
Part-Of-Speech Tagging
29 papers | 15 benchmarks
Reading Comprehension
29 papers | 7 benchmarks
Argument Mining
28 papers | 1 benchmarks
Coherence Evaluation
28 papers | 2 benchmarks
Implicatures
28 papers | 1 benchmarks
multimodal generation
28 papers | 1 benchmarks
News Generation
28 papers | 0 benchmarks
Stance Detection
28 papers | 22 benchmarks
Active Learning
27 papers | 1 benchmarks
Aggression Identification
27 papers | 0 benchmarks
Definition Extraction
27 papers | 0 benchmarks
Drug–drug Interaction Extraction
27 papers | 3 benchmarks
Emotion Cause Extraction
27 papers | 1 benchmarks
Entity Extraction using GAN
27 papers | 0 benchmarks
Entity Linking
27 papers | 27 benchmarks
Legal Reasoning
27 papers | 2 benchmarks
Mamba
27 papers | 0 benchmarks
Question Selection
27 papers | 1 benchmarks
Temporal Relation Classification
27 papers | 4 benchmarks
Toxic Comment Classification
27 papers | 4 benchmarks
Transliteration
27 papers | 0 benchmarks
Word Similarity
27 papers | 1 benchmarks
Decoder
26 papers | 0 benchmarks
Opinion Mining
26 papers | 1 benchmarks
Pretrained Multilingual Language Models
26 papers | 0 benchmarks
Question Rewriting
26 papers | 0 benchmarks
Table-based Fact Verification
26 papers | 1 benchmarks
Abstract Argumentation
25 papers | 0 benchmarks
Cross-Lingual Document Classification
25 papers | 10 benchmarks
Cross-Lingual Question Answering
25 papers | 3 benchmarks
Deep Learning
25 papers | 0 benchmarks
Diachronic Word Embeddings
25 papers | 0 benchmarks
Event Causality Identification
25 papers | 0 benchmarks
Low-Resource Neural Machine Translation
25 papers | 1 benchmarks
Protein Folding
25 papers | 0 benchmarks
Timeline Summarization
25 papers | 1 benchmarks
Automatic Post-Editing
24 papers | 0 benchmarks
CCG Supertagging
24 papers | 1 benchmarks
Coreference Resolution
24 papers | 16 benchmarks
Literature Mining
24 papers | 0 benchmarks
Method name prediction
24 papers | 1 benchmarks
Topic Models
24 papers | 6 benchmarks
Unsupervised Dependency Parsing
24 papers | 1 benchmarks
Chinese Named Entity Recognition
23 papers | 7 benchmarks
Emotional Intelligence
23 papers | 1 benchmarks
Few-Shot Relation Classification
23 papers | 4 benchmarks
Image to Video Generation
23 papers | 0 benchmarks
Semantic Retrieval
23 papers | 1 benchmarks
Taxonomy Expansion
23 papers | 0 benchmarks
Text-to-Image Generation
23 papers | 13 benchmarks
Text-To-SQL
23 papers | 10 benchmarks
Winogrande
23 papers | 0 benchmarks
Abuse Detection
22 papers | 0 benchmarks
Cross-Lingual Entity Linking
22 papers | 0 benchmarks
Data-free Knowledge Distillation
22 papers | 2 benchmarks
Dialog Act Classification
22 papers | 1 benchmarks
Extract Aspect
22 papers | 1 benchmarks
Extreme Summarization
22 papers | 4 benchmarks
Scientific Document Summarization
22 papers | 1 benchmarks
Short-Text Conversation
22 papers | 0 benchmarks
Table Retrieval
22 papers | 1 benchmarks
Text Retrieval
22 papers | 16 benchmarks
Word Translation
22 papers | 0 benchmarks
Cloze Test
21 papers | 2 benchmarks
Constituency Grammar Induction
21 papers | 1 benchmarks
Conversational Response Generation
21 papers | 0 benchmarks
Cross Document Coreference Resolution
21 papers | 0 benchmarks
KG-to-Text Generation
21 papers | 11 benchmarks
Large Language Model
21 papers | 2 benchmarks
Linguistic Acceptability
21 papers | 5 benchmarks
Opinion Summarization
21 papers | 0 benchmarks
Passage Ranking
21 papers | 1 benchmarks
Text Clustering
21 papers | 3 benchmarks
Zero-shot Slot Filling
21 papers | 3 benchmarks
Dependency Grammar Induction
20 papers | 2 benchmarks
Entity Typing
20 papers | 8 benchmarks
Intent Detection
20 papers | 19 benchmarks
Key Information Extraction
20 papers | 6 benchmarks
LLM-generated Text Detection
20 papers | 0 benchmarks
Paraphrase Identification
20 papers | 11 benchmarks
Probing Language Models
20 papers | 1 benchmarks
Specificity
20 papers | 0 benchmarks
Text Anonymization
20 papers | 0 benchmarks
Cross-Domain Named Entity Recognition
19 papers | 1 benchmarks
Dynamic Topic Modeling
19 papers | 0 benchmarks
Explanation Generation
19 papers | 5 benchmarks
Fine-Grained Opinion Analysis
19 papers | 1 benchmarks
Formality Style Transfer
19 papers | 1 benchmarks
Linguistic steganography
19 papers | 0 benchmarks
Low Resource Neural Machine Translation
19 papers | 0 benchmarks
Multi-Hop Reading Comprehension
19 papers | 0 benchmarks
Multi-Label Text Classification
19 papers | 20 benchmarks
News Classification
19 papers | 4 benchmarks
Relationship Extraction (Distant Supervised)
19 papers | 2 benchmarks
text annotation
19 papers | 0 benchmarks
Text-to-Video Generation
19 papers | 6 benchmarks
Toponym Resolution
19 papers | 0 benchmarks
XLM-R
19 papers | 0 benchmarks
Aspect Category Sentiment Analysis
18 papers | 1 benchmarks
Component Classification
18 papers | 1 benchmarks
Data-to-Text Generation
18 papers | 26 benchmarks
Event Relation Extraction
18 papers | 0 benchmarks
Language Acquisition
18 papers | 1 benchmarks
Story Generation
18 papers | 5 benchmarks
Answer Selection
17 papers | 6 benchmarks
Chinese Spell Checking
17 papers | 1 benchmarks
Complex Word Identification
17 papers | 0 benchmarks
Concept-To-Text Generation
17 papers | 1 benchmarks
De-identification
17 papers | 0 benchmarks
Gender Bias Detection
17 papers | 0 benchmarks
Memorization
17 papers | 1 benchmarks
nlg evaluation
17 papers | 0 benchmarks
POS Tagging
17 papers | 2 benchmarks
Semantic Role Labeling
17 papers | 7 benchmarks
Topic coverage
17 papers | 3 benchmarks
Vietnamese Datasets
17 papers | 0 benchmarks
Visual Dialog
17 papers | 8 benchmarks
Zero-Shot Stance Detection
17 papers | 0 benchmarks
AMR Parsing
16 papers | 8 benchmarks
Citation Intent Classification
16 papers | 2 benchmarks
Conditional Text Generation
16 papers | 1 benchmarks
Cross-Lingual Information Retrieval
16 papers | 0 benchmarks
Embeddings Evaluation
16 papers | 0 benchmarks
Fake News Detection
16 papers | 10 benchmarks
Keyword Extraction
16 papers | 3 benchmarks
Relational Reasoning
16 papers | 1 benchmarks
Semantic Textual Similarity
16 papers | 13 benchmarks
Story Completion
16 papers | 0 benchmarks
Table-to-Text Generation
16 papers | 8 benchmarks
Text Summarization
16 papers | 37 benchmarks
Transition-Based Dependency Parsing
16 papers | 0 benchmarks
Zero-Shot Text-to-Image Generation
16 papers | 0 benchmarks
Abstract Meaning Representation
15 papers | 0 benchmarks
Action Parsing
15 papers | 1 benchmarks
Aspect-Based Sentiment Analysis (ABSA)
15 papers | 18 benchmarks
Authorship Verification
15 papers | 0 benchmarks
Continual Relation Extraction
15 papers | 0 benchmarks
Dialogue Act Classification
15 papers | 5 benchmarks
Language Modeling
15 papers | 0 benchmarks
Machine Translation
15 papers | 83 benchmarks
PICO
15 papers | 1 benchmarks
Polyphone disambiguation
15 papers | 1 benchmarks
Prosody Prediction
15 papers | 1 benchmarks
Question Answering
15 papers | 149 benchmarks
Temporal Tagging
15 papers | 8 benchmarks
Aspect Term Extraction and Sentiment Classification
14 papers | 1 benchmarks
Cross-Domain Text Classification
14 papers | 0 benchmarks
Dialog Relation Extraction
14 papers | 2 benchmarks
Fact Selection
14 papers | 1 benchmarks
Implicit Relations
14 papers | 1 benchmarks
Key Point Matching
14 papers | 0 benchmarks
Profile Generation
14 papers | 1 benchmarks
Semantic entity labeling
14 papers | 2 benchmarks
Spam detection
14 papers | 1 benchmarks
Table-based Question Answering
14 papers | 0 benchmarks
Table Search
14 papers | 0 benchmarks
Text Generation
14 papers | 71 benchmarks
Automated Writing Evaluation
13 papers | 0 benchmarks
Cell Entity Annotation
13 papers | 5 benchmarks
Comment Generation
13 papers | 0 benchmarks
Commonsense Causal Reasoning
13 papers | 0 benchmarks
DRS Parsing
13 papers | 2 benchmarks
Extractive Summarization
13 papers | 0 benchmarks
Few-shot NER
13 papers | 4 benchmarks
Long-Context Understanding
13 papers | 5 benchmarks
Model Editing
13 papers | 0 benchmarks
Parallel Corpus Mining
13 papers | 0 benchmarks
Persian Sentiment Analysis
13 papers | 0 benchmarks
RAG
13 papers | 0 benchmarks
Text Classification
13 papers | 85 benchmarks
UCCA Parsing
13 papers | 2 benchmarks
Arabic Text Diacritization
12 papers | 2 benchmarks
Causal Emotion Entailment
12 papers | 1 benchmarks
Conversation Disentanglement
12 papers | 3 benchmarks
Humor Detection
12 papers | 1 benchmarks
Key-value Pair Extraction
12 papers | 2 benchmarks
Negation Scope Resolution
12 papers | 4 benchmarks
Predicate Detection
12 papers | 3 benchmarks
Relevance Detection
12 papers | 0 benchmarks
Sentence Pair Modeling
12 papers | 0 benchmarks
Session Search
12 papers | 0 benchmarks
Simultaneous Speech-to-Text Translation
12 papers | 0 benchmarks
Unsupervised Text Classification
12 papers | 4 benchmarks
Author Attribution
11 papers | 0 benchmarks
Columns Property Annotation
11 papers | 4 benchmarks
End-To-End Dialogue Modelling
11 papers | 2 benchmarks
Hint Generation
11 papers | 0 benchmarks
Mathematical Question Answering
11 papers | 2 benchmarks
Multiple Choice Question Answering (MCQA)
11 papers | 31 benchmarks
Nested Mention Recognition
11 papers | 2 benchmarks
Paper generation
11 papers | 2 benchmarks
Passage Retrieval
11 papers | 6 benchmarks
Question Similarity
11 papers | 1 benchmarks
Satire Detection
11 papers | 0 benchmarks
Subjectivity Analysis
11 papers | 2 benchmarks
Toponym Recognition
11 papers | 0 benchmarks
Vietnamese Word Segmentation
11 papers | 0 benchmarks
Zero-Shot Cross-Lingual Transfer
11 papers | 2 benchmarks
Zero-shot Named Entity Recognition (NER)
11 papers | 4 benchmarks
Abusive Language
10 papers | 0 benchmarks
Chunking
10 papers | 5 benchmarks
Cross-Lingual Semantic Textual Similarity
10 papers | 0 benchmarks
Document Ranking
10 papers | 2 benchmarks
Lay Summarization
10 papers | 2 benchmarks
Multi-modal Named Entity Recognition
10 papers | 5 benchmarks
Natural Language Understanding
10 papers | 6 benchmarks
Open-Domain Dialog
10 papers | 1 benchmarks
Semantic Composition
10 papers | 0 benchmarks
Semantic Shift Detection
10 papers | 0 benchmarks
Simultaneous Speech-to-Speech Translation
10 papers | 0 benchmarks
Only Connect Walls Dataset Task 1 (Grouping)
10 papers | 1 benchmarks
Text Simplification
10 papers | 11 benchmarks
Variable Detection
10 papers | 1 benchmarks
Zero-shot Event Extraction
10 papers | 0 benchmarks
AI Agent
9 papers | 0 benchmarks
answerability prediction
9 papers | 1 benchmarks
Binary Relation Extraction
9 papers | 2 benchmarks
Bridging Anaphora Resolution
9 papers | 0 benchmarks
Chinese Zero Pronoun Resolution
9 papers | 0 benchmarks
Connective Detection
9 papers | 0 benchmarks
Document Dating
9 papers | 2 benchmarks
Image-guided Story Ending Generation
9 papers | 2 benchmarks
molecular representation
9 papers | 0 benchmarks
Response Generation
9 papers | 3 benchmarks
Sentiment Analysis
9 papers | 42 benchmarks
Unsupervised Opinion Summarization
9 papers | 3 benchmarks
Vietnamese Social Media Text Processing
9 papers | 0 benchmarks
Author Profiling
8 papers | 0 benchmarks
Belebele
8 papers | 0 benchmarks
Bilingual Lexicon Induction
8 papers | 0 benchmarks
Cross-Lingual Word Embeddings
8 papers | 0 benchmarks
Definition Modelling
8 papers | 0 benchmarks
Dialog Learning
8 papers | 0 benchmarks
Emotion Recognition in Context
8 papers | 4 benchmarks
Grammatical Error Correction
8 papers | 13 benchmarks
Handwritten Chinese Text Recognition
8 papers | 0 benchmarks
Multi-agent Integration
8 papers | 1 benchmarks
Offline Handwritten Chinese Character Recognition
8 papers | 0 benchmarks
Paraphrase Generation
8 papers | 3 benchmarks
Sarcasm Detection
8 papers | 9 benchmarks
Spatial Reasoning
8 papers | 2 benchmarks
Summarization
8 papers | 12 benchmarks
target-oriented opinion words extraction
8 papers | 0 benchmarks
Thai Word Segmentation
8 papers | 2 benchmarks
Unsupervised Sentence Summarization
8 papers | 0 benchmarks
User Simulation
8 papers | 0 benchmarks
Vietnamese Hate Speech Detection
8 papers | 0 benchmarks
WNLI
8 papers | 0 benchmarks
Zero-Shot Machine Translation
8 papers | 0 benchmarks
Aspect-oriented Opinion Extraction
7 papers | 1 benchmarks
Code Documentation Generation
7 papers | 7 benchmarks
Contextualised Word Representations
7 papers | 0 benchmarks
Dialogue Rewriting
7 papers | 3 benchmarks
Few-Shot Stance Detection
7 papers | 0 benchmarks
Image Segmentation
7 papers | 12 benchmarks
Japanese Word Segmentation
7 papers | 1 benchmarks
Meme Classification
7 papers | 3 benchmarks
Occupation prediction
7 papers | 0 benchmarks
Open Intent Discovery
7 papers | 6 benchmarks
Privacy Preserving Deep Learning
7 papers | 0 benchmarks
Propaganda detection
7 papers | 0 benchmarks
Propaganda span identification
7 papers | 0 benchmarks
Query-Based Extractive Summarization
7 papers | 1 benchmarks
Slot Filling
7 papers | 14 benchmarks
SNARKS
7 papers | 0 benchmarks
Text Attribute Transfer
7 papers | 0 benchmarks
Timex normalization
7 papers | 2 benchmarks
Vietnamese Visual Question Answering
7 papers | 0 benchmarks
Word Sense Induction
7 papers | 1 benchmarks
Aspect-Category-Opinion-Sentiment Quadruple Extraction
6 papers | 2 benchmarks
Aspect Category Polarity
6 papers | 1 benchmarks
Cognate Prediction
6 papers | 0 benchmarks
Cross-Lingual Bitext Mining
6 papers | 4 benchmarks
Deep Attention
6 papers | 0 benchmarks
Equation Discovery
6 papers | 0 benchmarks
Fact Verification
6 papers | 3 benchmarks
Grounded language learning
6 papers | 0 benchmarks
Information Retrieval
6 papers | 34 benchmarks
Math Word Problem Solving
6 papers | 13 benchmarks
Mathematical Reasoning
6 papers | 11 benchmarks
Morpheme Segmentation
6 papers | 1 benchmarks
News Annotation
6 papers | 0 benchmarks
Open Intent Detection
6 papers | 17 benchmarks
Selection bias
6 papers | 0 benchmarks
Syntax Representation
6 papers | 0 benchmarks
Task-Completion Dialogue Policy Learning
6 papers | 0 benchmarks
Temporal/Causal QA
6 papers | 1 benchmarks
Term Extraction
6 papers | 2 benchmarks
text-to-Cypher
6 papers | 0 benchmarks
Vietnamese Language Models
6 papers | 0 benchmarks
Zero-shot Sentiment Classification
6 papers | 1 benchmarks
Argument Pair Extraction (APE)
5 papers | 1 benchmarks
Binary Condescension Detection
5 papers | 1 benchmarks
Continual Named Entity Recognition
5 papers | 0 benchmarks
Cross-Lingual Transfer
5 papers | 1 benchmarks
Dialogue Interpretation
5 papers | 0 benchmarks
Drug Design
5 papers | 0 benchmarks
DrugProt
5 papers | 1 benchmarks
Job classification
5 papers | 0 benchmarks
Job Prediction
5 papers | 0 benchmarks
Lexical Analysis
5 papers | 0 benchmarks
Long Form Question Answering
5 papers | 0 benchmarks
Multi-label Condescension Detection
5 papers | 1 benchmarks
Multimodal Machine Translation
5 papers | 3 benchmarks
Named Entity Recognition In Vietnamese
5 papers | 2 benchmarks
Personality Alignment
5 papers | 0 benchmarks
Reading Order Detection
5 papers | 2 benchmarks
Riddle Sense
5 papers | 2 benchmarks
Scientific Results Extraction
5 papers | 2 benchmarks
Stereotypical Bias Analysis
5 papers | 1 benchmarks
Text Effects Transfer
5 papers | 0 benchmarks
Unsupervised Part-Of-Speech Tagging
5 papers | 0 benchmarks
Vietnamese Image Captioning
5 papers | 0 benchmarks
Zero-shot Relation Triplet Extraction
5 papers | 2 benchmarks
Abstract Anaphora Resolution
4 papers | 1 benchmarks
Attribute Mining
4 papers | 3 benchmarks
Authorship Attribution
4 papers | 0 benchmarks
Bangla Spelling Error Correction
4 papers | 1 benchmarks
Chemical Indexing
4 papers | 1 benchmarks
Class-level Code Generation
4 papers | 1 benchmarks
Cross-lingual zero-shot dependency parsing
4 papers | 1 benchmarks
Chinese Spelling Error Correction
4 papers | 0 benchmarks
Document-level Relation Extraction
4 papers | 3 benchmarks
Emotional Dialogue Acts
4 papers | 0 benchmarks
Empirical Judgments
4 papers | 1 benchmarks
Extracting COVID-19 Events from Twitter
4 papers | 1 benchmarks
Face Selection
4 papers | 0 benchmarks
Goal-Oriented Dialog
4 papers | 1 benchmarks
Hope Speech Detection for Tamil
4 papers | 1 benchmarks
Information Threading
4 papers | 2 benchmarks
Instruction Following
4 papers | 1 benchmarks
Interactive Evaluation of Dialog
4 papers | 1 benchmarks
Joint Multilingual Sentence Representations
4 papers | 0 benchmarks
Logical Reasoning Question Answering
4 papers | 1 benchmarks
Logical Reasoning Reading Comprehension
4 papers | 0 benchmarks
Misogynistic Aggression Identification
4 papers | 0 benchmarks
Multimodal Attribute Value Extraction
4 papers | 0 benchmarks
Open Information Extraction
4 papers | 13 benchmarks
Page Stream Segmentation
4 papers | 0 benchmarks
Personality Generation
4 papers | 0 benchmarks
Reliable Intelligence Identification
4 papers | 0 benchmarks
Semantic Role Labeling (predicted predicates)
4 papers | 2 benchmarks
Speculation Detection
4 papers | 0 benchmarks
Text-Based Stock Prediction
4 papers | 0 benchmarks
Text-to-video search
4 papers | 0 benchmarks
Timedial
4 papers | 1 benchmarks
Twitter Event Detection
4 papers | 1 benchmarks
Unsupervised Sentence Compression
4 papers | 0 benchmarks
Unsupervised semantic parsing
4 papers | 2 benchmarks
Vietnamese Fact Checking
4 papers | 0 benchmarks
Vietnamese Speech Recognition
4 papers | 0 benchmarks
AI and Safety
3 papers | 0 benchmarks
Aspect Category Sentiment Classification
3 papers | 0 benchmarks
Aspect-Sentiment-Opinion Triplet Extraction
3 papers | 1 benchmarks
Constituency Parsing
3 papers | 4 benchmarks
Conversational Web Navigation
3 papers | 1 benchmarks
Dark Humor Detection
3 papers | 1 benchmarks
Data Mining
3 papers | 0 benchmarks
Dialogue Safety Prediction
3 papers | 2 benchmarks
Disambiguation QA
3 papers | 0 benchmarks
Discourse Marker Prediction
3 papers | 1 benchmarks
Domain Labelling
3 papers | 1 benchmarks
End-to-End RST Parsing
3 papers | 1 benchmarks
English Proverbs
3 papers | 1 benchmarks
Extract aspect-polarity tuple
3 papers | 1 benchmarks
Few-shot HTC
3 papers | 0 benchmarks
Formal Fallacies Syllogisms Negation
3 papers | 0 benchmarks
Hate Speech Normalization
3 papers | 0 benchmarks
Hyperbaton
3 papers | 0 benchmarks
image-sentence alignment
3 papers | 12 benchmarks
Information Extraction
3 papers | 1 benchmarks
KB-to-Language Generation
3 papers | 1 benchmarks
Meme Captioning
3 papers | 0 benchmarks
Memex Question Answering
3 papers | 1 benchmarks
Multi-modal Dialogue Generation
3 papers | 1 benchmarks
Negation Detection
3 papers | 0 benchmarks
Personality Recognition in Conversation
3 papers | 1 benchmarks
Phrase Ranking
3 papers | 2 benchmarks
Phrase Relatedness
3 papers | 1 benchmarks
Phrase Tagging
3 papers | 2 benchmarks
Political Salient Issue Orientation Detection
3 papers | 1 benchmarks
Poll Generation
3 papers | 1 benchmarks
Recognizing Emotion Cause in Conversations
3 papers | 2 benchmarks
Record linking
3 papers | 0 benchmarks
Relational Captioning
3 papers | 1 benchmarks
Ruin Names
3 papers | 0 benchmarks
Sentence Classification
3 papers | 6 benchmarks
Sentence Embeddings For Biomedical Texts
3 papers | 2 benchmarks
Social Media Mental Health Detection
3 papers | 0 benchmarks
Sonnet Generation
3 papers | 0 benchmarks
Speculation Scope Resolution
3 papers | 3 benchmarks
Turning Point Identification
3 papers | 0 benchmarks
Vietnamese Aspect-Based Sentiment Analysis
3 papers | 0 benchmarks
Vietnamese Natural Language Understanding
3 papers | 0 benchmarks
Vietnamese Scene Text
3 papers | 0 benchmarks
Vietnamese Sentiment Analysis
3 papers | 0 benchmarks
4-ary Relation Extraction
2 papers | 1 benchmarks
ArabicMMLU
2 papers | 0 benchmarks
Automatic Writing
2 papers | 0 benchmarks
Claim-Evidence Pair Extraction (CEPE)
2 papers | 1 benchmarks
Claim Extraction with Stance Classification (CESC)
2 papers | 1 benchmarks
Clinical Information Retrieval
2 papers | 0 benchmarks
Clinical Language Translation
2 papers | 0 benchmarks
Clinical Section Identification
2 papers | 1 benchmarks
Collaborative Plan Acquisition
2 papers | 0 benchmarks
Context Query Reformulation
2 papers | 0 benchmarks
Croatian Text Diacritization
2 papers | 1 benchmarks
Cross-lingual Text-to-Image Generation
2 papers | 0 benchmarks
Czech Text Diacritization
2 papers | 1 benchmarks
Description-guided molecule generation
2 papers | 1 benchmarks
Document-level Closed Information Extraction
2 papers | 3 benchmarks
Document-level RE with incomplete labeling
2 papers | 2 benchmarks
Email Thread Summarization
2 papers | 2 benchmarks
Event-Driven Trading
2 papers | 0 benchmarks
Fantasy Reasoning
2 papers | 1 benchmarks
few-shot-htc
2 papers | 0 benchmarks
Figure Of Speech Detection
2 papers | 1 benchmarks
French Text Diacritization
2 papers | 1 benchmarks
GRE Reading Comprehension
2 papers | 1 benchmarks
Hate Span Identification
2 papers | 0 benchmarks
Hidden Aspect Detection
2 papers | 0 benchmarks
Hierarchical Text Classification of Blurbs (GermEval 2019)
2 papers | 1 benchmarks
Hierarchical Text Clustering
2 papers | 0 benchmarks
Hope Speech Detection for English
2 papers | 1 benchmarks
Hope Speech Detection for Malayalam
2 papers | 1 benchmarks
Hungarian Text Diacritization
2 papers | 1 benchmarks
Hyper-Relational Extraction
2 papers | 1 benchmarks
Image-to-Text Retrieval
2 papers | 8 benchmarks
incongruity detection
2 papers | 0 benchmarks
Intrusion Detection
2 papers | 5 benchmarks
Irish Text Diacritization
2 papers | 1 benchmarks
Irony Identification
2 papers | 1 benchmarks
Keyphrase Extraction
2 papers | 6 benchmarks
Latvian Text Diacritization
2 papers | 1 benchmarks
legal outcome extraction
2 papers | 0 benchmarks
Machine Reading Comprehension
2 papers | 4 benchmarks
Math Information Retrieval
2 papers | 1 benchmarks
Molecular description generation
2 papers | 0 benchmarks
Movie Dialog Same Or Different
2 papers | 1 benchmarks
Multi-Document Summarization
2 papers | 5 benchmarks
Multi-lingual Text-to-Image Generation
2 papers | 0 benchmarks
multilingual cross-modal retrieval
2 papers | 0 benchmarks
Multilingual Paraphrase Generation
2 papers | 0 benchmarks
Multimodal Abstractive Text Summarization
2 papers | 1 benchmarks
Multimodal Lexical Translation
2 papers | 4 benchmarks
Natural Language Transduction
2 papers | 0 benchmarks
Negation and Speculation Cue Detection
2 papers | 2 benchmarks
Negation and Speculation Scope resolution
2 papers | 0 benchmarks
Nonsense Words Grammar
2 papers | 1 benchmarks
Open Relation Modeling
2 papers | 0 benchmarks
Personalized and Emotional Conversation
2 papers | 1 benchmarks
Political evaluation
2 papers | 0 benchmarks
RACE-h
2 papers | 1 benchmarks
RACE-m
2 papers | 1 benchmarks
Reader-Aware Summarization
2 papers | 1 benchmarks
Role-filler Entity Extraction
2 papers | 1 benchmarks
Romanian Text Diacritization
2 papers | 1 benchmarks
Scientific Concept Extraction
2 papers | 1 benchmarks
Semantic Similarity
2 papers | 26 benchmarks
SemEval-2022 Task 4-1 (Binary PCL Detection)
2 papers | 1 benchmarks
SemEval-2022 Task 4-2 (Multi-label PCL Detection)
2 papers | 1 benchmarks
Semi-Supervised Text Regression
2 papers | 0 benchmarks
Sensitivity Classification
2 papers | 1 benchmarks
Sentiment Dependency Learning
2 papers | 0 benchmarks
Sketch-to-text Generation
2 papers | 0 benchmarks
Slovak Text Diacritization
2 papers | 1 benchmarks
Spanish Text Diacritization
2 papers | 1 benchmarks
SSTOD
2 papers | 2 benchmarks
Task-Oriented Dialogue Systems
2 papers | 4 benchmarks
Text Matching
2 papers | 0 benchmarks
Text-to-GQL
2 papers | 0 benchmarks
Text-Variation
2 papers | 0 benchmarks
Textual Analogy Parsing
2 papers | 0 benchmarks
True or False Question Answering
2 papers | 0 benchmarks
trustable and focussed LLM generated content
2 papers | 0 benchmarks
Turkish Text Diacritization
2 papers | 1 benchmarks
Understanding Fables
2 papers | 1 benchmarks
Unsupervised KG-to-Text Generation
2 papers | 4 benchmarks
Unsupervised Machine Translation
2 papers | 9 benchmarks
ValNov
2 papers | 2 benchmarks
Vietnamese Parsing
2 papers | 0 benchmarks
Vietnamese Text Diacritization
2 papers | 1 benchmarks
Visual Commonsense Tests
2 papers | 1 benchmarks
Workflow Discovery
2 papers | 1 benchmarks
Alignment visualisation
1 papers | 0 benchmarks
Anaphora Resolution
1 papers | 0 benchmarks
ARQMath2
1 papers | 0 benchmarks
Aspect Sentiment Triplet Extraction
1 papers | 4 benchmarks
Bangla Text Detection
1 papers | 1 benchmarks
Blackout Poetry Generation
1 papers | 1 benchmarks
Catalog Extraction
1 papers | 1 benchmarks
Cause-Effect Relation Classification
1 papers | 0 benchmarks
Chinese
1 papers | 0 benchmarks
Clinical Assertion Status Detection
1 papers | 1 benchmarks
Coding Problem Tagging
1 papers | 0 benchmarks
Commonsense Reasoning for RL
1 papers | 1 benchmarks
Complaint Comment Classification
1 papers | 0 benchmarks
Context-specific Spam Detection
1 papers | 1 benchmarks
Contextualized Literature-based Discovery
1 papers | 0 benchmarks
Controllable Language Modelling
1 papers | 0 benchmarks
Conversational Sentiment Quadruple Extraction
1 papers | 2 benchmarks
Counterspeech Detection
1 papers | 1 benchmarks
Cross-Document Language Modeling
1 papers | 2 benchmarks
Cross-Language Text Summarization
1 papers | 0 benchmarks
Cross-Lingual
1 papers | 0 benchmarks
Crowdsourced Text Aggregation
1 papers | 2 benchmarks
Detection of potentially void clauses
1 papers | 1 benchmarks
Dialogue
1 papers | 1 benchmarks
Direct NMT
1 papers | 0 benchmarks
Emergent communications on relations
1 papers | 0 benchmarks
Emotion Detection and Trigger Summarization
1 papers | 0 benchmarks
Entity Typing on DH-KGs
1 papers | 0 benchmarks
Extractive Tags Summarization
1 papers | 0 benchmarks
Fact-based Text Editing
1 papers | 2 benchmarks
FG-1-PG-1
1 papers | 3 benchmarks
Figurative Language Visualization
1 papers | 0 benchmarks
Genetic IE
1 papers | 0 benchmarks
GermEval2024 Shared Task 1 Subtask 1
1 papers | 1 benchmarks
GermEval2024 Shared Task 1 Subtask 2
1 papers | 1 benchmarks
Grapheme Detection
1 papers | 0 benchmarks
Grounded Open Vocabulary Acquisition
1 papers | 0 benchmarks
Hate Intensity Prediction
1 papers | 0 benchmarks
Hate Speech Detection CrisisHateMM Benchmark
1 papers | 0 benchmarks
Hurtful Sentence Completion
1 papers | 1 benchmarks
Joint Entity and Relation Extraction on Scientific Data
1 papers | 0 benchmarks
Joint NER and Classification
1 papers | 0 benchmarks
Latent Aspect Detection
1 papers | 0 benchmarks
Legal Document Translation
1 papers | 0 benchmarks
Line Items Extraction
1 papers | 0 benchmarks
Link prediction on DH-KGs
1 papers | 1 benchmarks
Medical question pair similarity computation
1 papers | 0 benchmarks
Meeting Summarization
1 papers | 2 benchmarks
Metric-Type Identification
1 papers | 0 benchmarks
MMSQL performance
1 papers | 1 benchmarks
Morphological Analysis
1 papers | 0 benchmarks
Multi-Dialect Vietnamese
1 papers | 0 benchmarks
Multi-Grained Named Entity Recognition
1 papers | 0 benchmarks
Multi-Labeled Relation Extraction
1 papers | 0 benchmarks
multi-word expression embedding
1 papers | 0 benchmarks
multi-word expression sememe prediction
1 papers | 0 benchmarks
Multilingual Machine Comprehension in English Hindi
1 papers | 1 benchmarks
Multimedia Generative Script Learning
1 papers | 0 benchmarks
Multimodal GIF Dialog
1 papers | 1 benchmarks
Multimodal Text Prediction
1 papers | 1 benchmarks
Multiview Contextual Commonsense Inference
1 papers | 2 benchmarks
Multilingual Neural Machine Translation
1 papers | 0 benchmarks
Natural Language Landmark Navigation Instructions Generation
1 papers | 1 benchmarks
Open-World Social Event Classification
1 papers | 0 benchmarks
Overlapping Mention Recognition
1 papers | 0 benchmarks
PCL Detection
1 papers | 0 benchmarks
Persona Dialogue in Story
1 papers | 1 benchmarks
Phrase Vector Embedding
1 papers | 0 benchmarks
Poem meters classification
1 papers | 1 benchmarks
Problem-Solving Deliberation
1 papers | 1 benchmarks
Pronunciation Dictionary Creation
1 papers | 0 benchmarks
Propaganda technique identification
1 papers | 0 benchmarks
quantum circuit classification (classical ML)
1 papers | 0 benchmarks
Query Wellformedness
1 papers | 1 benchmarks
Question-Answer categorization
1 papers | 1 benchmarks
Question Quality Assessment
1 papers | 2 benchmarks
Question to Declarative Sentence
1 papers | 0 benchmarks
Readability optimization
1 papers | 0 benchmarks
relation explanation
1 papers | 0 benchmarks
Relation Mention Extraction
1 papers | 0 benchmarks
Row Annotation
1 papers | 1 benchmarks
Rules-of-thumb Generation
1 papers | 0 benchmarks
Semi-Supervised Formality Style Transfer
1 papers | 0 benchmarks
Speaker Attribution in German Parliamentary Debates (GermEval 2023, subtask 1)
1 papers | 1 benchmarks
Stance Detection (US Election 2020 - Biden)
1 papers | 1 benchmarks
Stance Detection (US Election 2020 - Trump)
1 papers | 1 benchmarks
Summarization Consistency Evaluation
1 papers | 1 benchmarks
Table Type Detection
1 papers | 1 benchmarks
Only Connect Walls Dataset Task 2 (Connections)
1 papers | 0 benchmarks
Taxonomy Learning
1 papers | 0 benchmarks
Text-to-CQL
1 papers | 0 benchmarks
Traditional Spam Detection
1 papers | 1 benchmarks
Tweet-Reply Sentiment Analysis
1 papers | 1 benchmarks
Variable Disambiguation
1 papers | 1 benchmarks
Vietnamese Lexical Normalization
1 papers | 0 benchmarks
Vietnamese Multimodal Sentiment Analysis
1 papers | 0 benchmarks
Visual Storytelling
1 papers | 1 benchmarks
Weakly Supervised Data Denoising
1 papers | 0 benchmarks
Web Page Tagging
1 papers | 0 benchmarks
Word Attribute Transfer
1 papers | 0 benchmarks
Zero-Shot Out-of-Domain Detection
1 papers | 0 benchmarks