HyperAI

Interactive Evaluation Of Dialog On Dstc9

Metriken

Coherent
Consistent
Diversity
Error Recovery
Flexible
Informative
Inquisitive
Likeable
Overall Human Rating
Topic Depth
Understanding

Ergebnisse

Leistungsergebnisse verschiedener Modelle zu diesem Benchmark

Vergleichstabelle
ModellnameCoherentConsistentDiversityError RecoveryFlexibleInformativeInquisitiveLikeableOverall Human RatingTopic DepthUnderstanding
a-unified-pre-training-framework-for2.80170.93902.7441 2.75182.80002.78812.79492.78784.152.76782.8285