Fact Verification on KILT FEVER
Evaluation Metrics
Accuracy
KILT-AC
R-Prec
Recall@5
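
The metrics listed above are only named here, so the following is a minimal per-example sketch of how they are commonly computed on KILT-style tasks. It assumes a single gold set of Wikipedia page IDs per claim (a simplification: the benchmark can have several alternative gold sets per claim) and that KILT-AC credits a correct label only when R-Prec is 1. The function names and the page IDs `p1`, `p2`, ... are hypothetical and not taken from any official evaluation script. Leaderboard values would be the mean of these per-example scores, expressed as percentages.

```python
from typing import Sequence, Set


def r_precision(retrieved: Sequence[str], gold: Set[str]) -> float:
    """Share of the top-R retrieved pages that are gold, with R = len(gold)."""
    if not gold:
        return 0.0
    top_r = retrieved[: len(gold)]
    return len(set(top_r) & gold) / len(gold)


def recall_at_k(retrieved: Sequence[str], gold: Set[str], k: int = 5) -> float:
    """Share of the gold pages that appear among the top-k retrieved pages."""
    if not gold:
        return 0.0
    return len(set(retrieved[:k]) & gold) / len(gold)


def kilt_accuracy(predicted: str, label: str,
                  retrieved: Sequence[str], gold: Set[str]) -> float:
    """Accuracy that counts only when the retrieval is also perfect (R-Prec = 1).

    Assumption: this mirrors the usual description of KILT-AC, not the
    official scoring code.
    """
    label_correct = predicted == label
    return 1.0 if label_correct and r_precision(retrieved, gold) == 1.0 else 0.0


# Hypothetical example: two gold pages, five retrieved pages.
gold_pages = {"p1", "p2"}
ranking = ["p1", "p7", "p3", "p9", "p2"]

print(r_precision(ranking, gold_pages))     # 0.5 (only p1 is in the top 2)
print(recall_at_k(ranking, gold_pages, 5))  # 1.0 (both gold pages are in the top 5)
print(kilt_accuracy("SUPPORTS", "SUPPORTS", ranking, gold_pages))  # 0.0
```

Under these assumptions, a system that submits labels without any retrieved pages would score 0.0 on KILT-AC, R-Prec, and Recall@5 regardless of its Accuracy.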
Evaluation Results
Performance of each model on this benchmark
| Model | Accuracy | KILT-AC | R-Prec | Recall@5 | Paper Title | Repository |
|---|---|---|---|---|---|---|
| KGI | 85.58 | 64.41 | 75.6 | 84.95 | - | - |
| intersect | 89.54 | 71.28 | 81.45 | 89.56 | - | - |
| NSMN | 66.1 | 41.88 | 49.24 | 70.16 | - | - |
| QDA_EMB2 | 69.41 | 0.0 | 0.0 | 0.0 | - | - |
| SVM | 70.71 | 0.0 | 0.0 | 0.0 | - | - |
| galimaldo | 12.57 | 0.0 | 0.0 | 0.0 | - | - |
| BART | 78.93 | 0.0 | 0.0 | 0.0 | - | - |
| Alessandro_Tansel | 71.42 | 0.0 | 0.0 | 0.0 | - | - |
| Wikipedia | 88.99 | 65.68 | 74.77 | 87.89 | - | - |
| SVM with rbf kernel | 72.34 | 0.0 | 0.0 | 0.0 | - | - |
| T5-base | 76.3 | 0.0 | 0.0 | 0.0 | KILT: a Benchmark for Knowledge Intensive Language Tasks | |
| RAG | 86.31 | 53.45 | 61.94 | 75.55 | KILT: a Benchmark for Knowledge Intensive Language Tasks | |
| JuanTran | 71.38 | 0.0 | 0.0 | 0.0 | - | - |
| BERT + DPR | 69.68 | 58.58 | 72.93 | 73.52 | - | - |
| Re2G | 89.55 | 78.53 | 88.92 | 92.52 | Re2G: Retrieve, Rerank, Generate | |
| LogisticRegression | 23.01 | 0.0 | 0.0 | 0.0 | - | - |
| BART + DPR | 86.74 | 47.68 | 55.33 | 74.29 | - | - |
| QDA | 71.12 | 0.0 | 0.0 | 0.0 | - | - |
| Multitask DPR + BART | 86.32 | 63.94 | 74.48 | 87.52 | - | - |
| Multi-task DPR | 0.0 | 0.0 | 74.48 | 87.52 | - | - |