Dialogue Evaluation on USR-TopicalChat
Metrics
Pearson Correlation
Spearman Correlation
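Both metrics listed above measure how closely an automatic evaluator's scores track human quality ratings for the same responses. The following is a minimal sketch of how they can be computed in pure Python. It is not the evaluation script used by any of the listed papers, and the function names and the sample data (`metric_scores`, `human_ratings`) are hypothetical, chosen only for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation: covariance normalized by both standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Assign 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: automatic metric scores and human ratings per response.
metric_scores = [0.1, 0.4, 0.35, 0.8, 0.9]
human_ratings = [1, 3, 2, 4, 5]

print(f"Pearson:  {pearson(metric_scores, human_ratings):.4f}")
print(f"Spearman: {spearman(metric_scores, human_ratings):.4f}")
```

Pearson is sensitive to the linear relationship between the raw values, while Spearman depends only on their ordering, which is why an evaluator can rank differently on the two columns.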
Results
Performance of various models on this benchmark
Model | Pearson Correlation | Spearman Correlation | Paper Title | Repository |
---|---|---|---|---|
USR | 0.4220 | 0.4192 | USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation | - |
Lin-Reg (all) | 0.4974 | 0.4877 | Proxy Indicators for the Quality of Open-domain Dialogues | - |
MDD-Eval | 0.4575 | 0.5109 | MDD-Eval: Self-Training on Augmented Data for Multi-Domain Dialogue Evaluation | - |
USR - DR (x = c) | 0.4068 | 0.3245 | USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation | - |
USR - DR (x = f) | 0.3221 | 0.1419 | USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation | - |
USR - MLM | 0.3345 | 0.3086 | USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation | - |