HyperAI超神経

Interactive Evaluation of Dialog on DSTC9

Evaluation Metrics

Coherent
Consistent
Diversity
Error Recovery
Flexible
Informative
Inquisitive
Likeable
Overall Human Rating
Topic Depth
Understanding

Evaluation Results

Performance results of each model on this benchmark

Comparison Table
Model | Coherent | Consistent | Diversity | Error Recovery | Flexible | Informative | Inquisitive | Likeable | Overall Human Rating | Topic Depth | Understanding
a-unified-pre-training-framework-for | 2.8017 | 0.9390 | 2.7441 | 2.7518 | 2.8000 | 2.7881 | 2.7949 | 2.7878 | 4.15 | 2.7678 | 2.8285