Natural Language Understanding on DialoGLUE
Metrics
Average
Banking77 (Acc)
CLINC150 (Acc)
DSTC8 (F-1)
HWU64 (Acc)
MultiWOZ (Joint Goal Acc)
Restaurant8k (F-1)
TOP (EM)
Results
Performance results of various models on this benchmark
Comparison Table
Model Name | Average | Banking77 (Acc) | CLINC150 (Acc) | DSTC8 (F-1) | HWU64 (Acc) | MultiWOZ (Joint Goal Acc) | Restaurant8k (F-1) | TOP (EM) |
---|---|---|---|---|---|---|---|---|
Model 1 | 85.83 | 91.17 | 95.8 | 88.33 | 91.36 | 58.22 | 94.85 | 81.1 |
Model 2 | 86.89 | 93.44 | 92.38 | 91.2 | 97.11 | 56.56 | 95.44 | 82.08 |
Model 3 | 85.34 | 92.99 | 91.82 | 86.49 | 97.11 | 58.29 | 94.34 | 76.36 |
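
The page does not state how the Average column is derived. The figures in the table are consistent with an unweighted arithmetic mean of the seven task scores, and the sketch below checks that assumption against the three rows shown. The names `TASKS`, `SCORES`, `REPORTED_AVERAGE`, and `mean_score` are illustrative and not part of the benchmark.

```python
# Task columns in the order they appear in the comparison table.
TASKS = [
    "Banking77 (Acc)",
    "CLINC150 (Acc)",
    "DSTC8 (F-1)",
    "HWU64 (Acc)",
    "MultiWOZ (Joint Goal Acc)",
    "Restaurant8k (F-1)",
    "TOP (EM)",
]

# Per-task scores copied from the table rows, in TASKS order.
SCORES = {
    "Model 1": [91.17, 95.8, 88.33, 91.36, 58.22, 94.85, 81.1],
    "Model 2": [93.44, 92.38, 91.2, 97.11, 56.56, 95.44, 82.08],
    "Model 3": [92.99, 91.82, 86.49, 97.11, 58.29, 94.34, 76.36],
}

# Values of the Average column as reported in the table.
REPORTED_AVERAGE = {"Model 1": 85.83, "Model 2": 86.89, "Model 3": 85.34}


def mean_score(scores):
    """Unweighted arithmetic mean (assumed formula for the Average column)."""
    return sum(scores) / len(scores)


if __name__ == "__main__":
    for model, scores in SCORES.items():
        computed = mean_score(scores)
        # Compare against the reported value at two-decimal precision.
        matches = abs(computed - REPORTED_AVERAGE[model]) < 0.005
        print(f"{model}: computed {computed:.2f}, "
              f"reported {REPORTED_AVERAGE[model]:.2f}, match={matches}")
```

Because the mean is unweighted, the two F-1 tasks, the exact-match task, and the joint goal accuracy task each count as much as any accuracy task. MultiWOZ scores are much lower than the rest, so they pull every model's average down by a similar amount.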