HyperAI초신경

Interactive Evaluation of Dialog on DSTC9

Evaluation Metrics

Coherent
Consistent
Diversity
Error Recovery
Flexible
Informative
Inquisitive
Likeable
Overall Human Rating
Topic Depth
Understanding

Evaluation Results

Performance results of each model on this benchmark

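Comparison Table

| Model Name | Coherent | Consistent | Diversity | Error Recovery | Flexible | Informative | Inquisitive | Likeable | Overall Human Rating | Topic Depth | Understanding |
|---|---|---|---|---|---|---|---|---|---|---|---|
| a-unified-pre-training-framework-for | 2.8017 | 0.9390 | 2.7441 | 2.7518 | 2.8000 | 2.7881 | 2.7949 | 2.7878 | 4.15 | 2.7678 | 2.8285 |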