Interactive Evaluation of Dialog on DSTC9

Metrics

Coherent

Consistent

Diversity

Error Recovery

Flexible

Informative

Inquisitive

Likeable

Overall Human Rating

Topic Depth

Understanding

Results

Performance results of various models on this benchmark

Model Name	Coherent	Consistent	Diversity	Error Recovery	Flexible	Informative	Inquisitive	Likeable	Overall Human Rating	Topic Depth	Understanding	Paper Title	Repository
PLATO-2	2.8017	0.9390	2.7441	2.7518	2.8000	2.7881	2.7949	2.7878	4.15	2.7678	2.8285	A Unified Pre-training Framework for Conversational AI
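
The tab-separated results table can be read programmatically. The following is a minimal sketch using only Python's standard library; the names `TABLE`, `NON_METRIC`, and `load_scores` are illustrative and do not come from this page. The header and row are copied verbatim from the table, including the empty Repository cell. The columns do not appear to share a scale (Consistent is below 1, Overall Human Rating is 4.15), and this page does not state the scale definitions, so the sketch keeps each metric separate rather than averaging them.

```python
import csv
import io

# Header and row copied from the results table (tab-separated).
# The trailing tab on the data row stands for the empty Repository cell.
TABLE = (
    "Model Name\tCoherent\tConsistent\tDiversity\tError Recovery\tFlexible\t"
    "Informative\tInquisitive\tLikeable\tOverall Human Rating\tTopic Depth\t"
    "Understanding\tPaper Title\tRepository\n"
    "PLATO-2\t2.8017\t0.9390\t2.7441\t2.7518\t2.8000\t2.7881\t2.7949\t2.7878\t"
    "4.15\t2.7678\t2.8285\tA Unified Pre-training Framework for Conversational AI\t\n"
)

# Columns that hold text rather than numeric scores (hypothetical helper set).
NON_METRIC = {"Model Name", "Paper Title", "Repository"}


def load_scores(text):
    """Return {model name: {metric name: float}} parsed from the table text."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    scores = {}
    for row in reader:
        scores[row["Model Name"]] = {
            column: float(value)
            for column, value in row.items()
            if column not in NON_METRIC
        }
    return scores


scores = load_scores(TABLE)
plato = scores["PLATO-2"]

# Print each metric on its own line, in the table's column order.
for metric, value in plato.items():
    print(f"{metric}: {value}")
```

Keeping the scores in a dictionary keyed by metric name makes it straightforward to add rows for further models if the leaderboard grows.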
