Conversational Response Selection On Douban 1
Evaluation Metrics
MAP
MRR
P@1
R10@1
R10@2
R10@5
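
The metrics listed above are standard ranking measures. The following is a minimal sketch of how they are commonly computed, not the benchmark's official scoring script. It assumes each dialogue context comes with a list of candidate responses, a model score per candidate, and binary relevance labels, and that R10@k means recall at position k among 10 candidates per context (the usual reading of this notation). The function names and the choice to score a context with no positive candidate as 0 are assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def rank_labels(scores: Sequence[float], labels: Sequence[int]) -> List[int]:
    """Return the labels reordered by descending model score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [labels[i] for i in order]


def average_precision(ranked: Sequence[int]) -> float:
    """Mean of precision values taken at each positive's rank."""
    hits, total = 0, 0.0
    for pos, label in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / pos
    return total / hits if hits else 0.0  # assumption: 0 when no positive


def reciprocal_rank(ranked: Sequence[int]) -> float:
    """1 / rank of the first positive candidate."""
    for pos, label in enumerate(ranked, start=1):
        if label:
            return 1.0 / pos
    return 0.0


def recall_at_k(ranked: Sequence[int], k: int) -> float:
    """Fraction of all positives that appear in the top k."""
    positives = sum(ranked)
    return sum(ranked[:k]) / positives if positives else 0.0


def evaluate(
    sessions: Sequence[Tuple[Sequence[float], Sequence[int]]],
    ks: Sequence[int] = (1, 2, 5),
) -> Dict[str, float]:
    """Average each metric over all (scores, labels) sessions."""
    ranked_lists = [rank_labels(s, l) for s, l in sessions]
    n = len(ranked_lists)
    result = {
        "MAP": sum(average_precision(r) for r in ranked_lists) / n,
        "MRR": sum(reciprocal_rank(r) for r in ranked_lists) / n,
        "P@1": sum(float(r[0] == 1) for r in ranked_lists) / n,
    }
    for k in ks:
        result[f"R@{k}"] = sum(recall_at_k(r, k) for r in ranked_lists) / n
    return result


# Toy data with 3 candidates per context (the benchmark itself uses 10).
toy_sessions = [
    ([0.9, 0.8, 0.1], [0, 1, 1]),
    ([0.2, 0.7, 0.5], [0, 1, 0]),
]
toy_metrics = evaluate(toy_sessions, ks=(1, 2))
```

Because a context may have more than one correct response, P@1 and recall at 1 can differ: P@1 only checks whether the top candidate is correct, while recall divides by the number of correct candidates.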
Evaluation Results
Performance of each model on this benchmark
Comparison Table
Model | MAP | MRR | P@1 | R10@1 | R10@2 | R10@5 |
---|---|---|---|---|---|---|
190501969 | 0.608 | 0.650 | 0.475 | 0.299 | 0.494 | 0.822 |
modeling-multi-turn-conversation-with-deep | 0.551 | 0.599 | 0.421 | 0.243 | 0.421 | 0.780 |
interactive-matching-network-for-multi-turn | 0.570 | 0.615 | 0.433 | 0.262 | 0.452 | 0.789 |
fine-grained-post-training-for-improving | 0.644 | 0.680 | 0.512 | 0.324 | 0.542 | 0.870 |
domain-adaptive-training-bert-for-response | 0.591 | 0.633 | 0.454 | 0.280 | 0.470 | 0.828 |
sequential-matching-network-a-new | 0.529 | 0.569 | 0.397 | 0.233 | 0.396 | 0.724 |
knowledge-aware-response-selection-with | 0.640 | 0.678 | 0.511 | 0.330 | 0.520 | 0.870 |
dialogue-response-selection-with-hierarchical | 0.639 | 0.681 | 0.514 | 0.330 | 0.531 | 0.858 |
do-response-selection-models-really-know-what | 0.625 | 0.664 | 0.499 | 0.318 | 0.482 | 0.858 |
speaker-aware-bert-for-multi-turn-response | 0.619 | 0.659 | 0.496 | 0.313 | 0.481 | 0.847 |
one-time-of-interaction-may-not-be-enough-go | 0.573 | 0.621 | 0.444 | 0.269 | 0.451 | 0.786 |
multi-hop-selector-network-for-multi-turn | 0.587 | 0.632 | 0.470 | 0.295 | 0.452 | 0.788 |
knowledge-aware-response-selection-with | 0.651 | 0.687 | 0.510 | 0.328 | 0.552 | 0.877 |
global-selector-a-new-benchmark-dataset-and | 0.622 | 0.662 | 0.481 | 0.303 | 0.514 | 0.852 |
multi-turn-response-selection-for-chatbots | 0.550 | 0.601 | 0.427 | 0.254 | 0.410 | 0.757 |
global-selector-a-new-benchmark-dataset-and | 0.648 | 0.688 | 0.518 | 0.327 | 0.557 | 0.865 |