Fact Verification on KILT FEVER
Evaluation Metrics
Accuracy
KILT-AC
R-Prec
Recall@5
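
To make the metric names above concrete, here is a minimal sketch of how the retrieval metrics and the combined score could be computed for a single claim. It assumes a simplified setting with one gold provenance set per claim; the official KILT scorer handles multiple provenance sets and may differ in detail. The function names (`r_precision`, `recall_at_k`, `kilt_accuracy`) and the page identifiers are hypothetical, chosen only for illustration. KILT-AC is modeled here as accuracy that is credited only when R-Prec equals 1 for that claim.

```python
def r_precision(retrieved, gold):
    """Share of gold pages found in the top-R retrieved pages, with R = number of gold pages."""
    gold_set = set(gold)
    if not gold_set:
        return 0.0
    top_r = set(retrieved[:len(gold_set)])
    return len(gold_set & top_r) / len(gold_set)


def recall_at_k(retrieved, gold, k=5):
    """Share of gold pages found in the top-k retrieved pages (simplified: one gold set)."""
    gold_set = set(gold)
    if not gold_set:
        return 0.0
    return len(gold_set & set(retrieved[:k])) / len(gold_set)


def kilt_accuracy(predicted_label, gold_label, retrieved, gold):
    """Credit a correct label only when retrieval is perfect (R-Prec == 1)."""
    label_correct = predicted_label == gold_label
    perfect_retrieval = r_precision(retrieved, gold) == 1.0
    return 1.0 if label_correct and perfect_retrieval else 0.0


# Hypothetical example: two gold pages, three retrieved pages in ranked order.
gold_pages = ["page_A", "page_C"]
ranked_pages = ["page_A", "page_B", "page_C"]

rp = r_precision(ranked_pages, gold_pages)
rec5 = recall_at_k(ranked_pages, gold_pages, k=5)
kilt_ac = kilt_accuracy("SUPPORTS", "SUPPORTS", ranked_pages, gold_pages)
```

In this example the label is correct, but only one of the two gold pages appears in the top two results, so the claim earns plain Accuracy credit and no KILT-AC credit. Leaderboard values would be averages of such per-claim scores over the test set, reported as percentages.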
Evaluation Results
Performance of each model on this benchmark
Model | Accuracy | KILT-AC | R-Prec | Recall@5 | Paper Title | Repository |
---|---|---|---|---|---|---|
KGI | 85.58 | 64.41 | 75.6 | 84.95 | - | - |
intersect | 89.54 | 71.28 | 81.45 | 89.56 | - | - |
NSMN | 66.1 | 41.88 | 49.24 | 70.16 | - | - |
QDA_EMB2 | 69.41 | 0.0 | 0.0 | 0.0 | - | - |
SVM | 70.71 | 0.0 | 0.0 | 0.0 | - | - |
galimaldo | 12.57 | 0.0 | 0.0 | 0.0 | - | - |
BART | 78.93 | 0.0 | 0.0 | 0.0 | - | - |
Alessandro_Tansel | 71.42 | 0.0 | 0.0 | 0.0 | - | - |
Wikipedia | 88.99 | 65.68 | 74.77 | 87.89 | - | - |
SVM with rbf kernel | 72.34 | 0.0 | 0.0 | 0.0 | - | - |
T5-base | 76.3 | 0.0 | 0.0 | 0.0 | KILT: a Benchmark for Knowledge Intensive Language Tasks | |
RAG | 86.31 | 53.45 | 61.94 | 75.55 | KILT: a Benchmark for Knowledge Intensive Language Tasks | |
JuanTran | 71.38 | 0.0 | 0.0 | 0.0 | - | - |
BERT + DPR | 69.68 | 58.58 | 72.93 | 73.52 | - | - |
Re2G | 89.55 | 78.53 | 88.92 | 92.52 | Re2G: Retrieve, Rerank, Generate | |
LogisticRegression | 23.01 | 0.0 | 0.0 | 0.0 | - | - |
BART + DPR | 86.74 | 47.68 | 55.33 | 74.29 | - | - |
QDA | 71.12 | 0.0 | 0.0 | 0.0 | - | - |
Multitask DPR + BART | 86.32 | 63.94 | 74.48 | 87.52 | - | - |
Multi-task DPR | 0.0 | 0.0 | 74.48 | 87.52 | - | - |