Slot Filling on KILT: T-REx
Evaluation Metrics
Accuracy
F1
KILT-AC
KILT-F1
R-Prec
Recall@5
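As a rough illustration of how the retrieval metrics and the combined KILT scores listed above relate, here is a minimal Python sketch. It assumes the definitions given in the KILT paper: R-Prec is the fraction of gold provenance pages found among the top-R retrieved pages (R being the number of gold pages), and KILT-AC / KILT-F1 award the downstream Accuracy / F1 only when R-Prec equals 1. The function names are hypothetical, and the sketch handles a single gold provenance set per query, whereas the official evaluation also handles multiple provenance sets.

```python
def r_precision(retrieved, gold):
    """Fraction of gold pages found in the top-R retrieved pages, R = number of gold pages."""
    gold_set = set(gold)
    if not gold_set:
        return 0.0
    top_r = set(retrieved[:len(gold_set)])
    return len(gold_set & top_r) / len(gold_set)


def recall_at_k(retrieved, gold, k=5):
    """Simplified Recall@k: share of gold pages that appear in the top-k retrieved pages."""
    gold_set = set(gold)
    if not gold_set:
        return 0.0
    return len(gold_set & set(retrieved[:k])) / len(gold_set)


def kilt_score(downstream_score, rprec):
    """KILT-AC / KILT-F1 style gating: keep the downstream score only if R-Prec is 1."""
    return downstream_score if rprec == 1.0 else 0.0


# Hypothetical query: one gold page, ranked second by the retriever.
retrieved_pages = ["page_B", "page_A", "page_C"]
gold_pages = ["page_A"]

rprec = r_precision(retrieved_pages, gold_pages)
recall5 = recall_at_k(retrieved_pages, gold_pages, k=5)
kilt_ac = kilt_score(1.0, rprec)  # the answer was correct, but provenance was not ranked first
```

Under these definitions, a system that returns a correct answer without the right provenance at the top scores on Accuracy but not on KILT-AC.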
Evaluation Results
Performance of each model on this benchmark
Model | Accuracy | F1 | KILT-AC | KILT-F1 | R-Prec | Recall@5 | Paper Title | Repository |
---|---|---|---|---|---|---|---|---|
multi-task small | 19.3 | 25.81 | 0.0 | 0.0 | 0.0 | 0.0 | - | - |
Multi-task DPR | 0.0 | 0.0 | 0.0 | 0.0 | 69.46 | 83.88 | - | - |
KGI_1 | 84.36 | 87.24 | 69.14 | 70.58 | 74.36 | 83.14 | - | - |
BART | 45.06 | 49.24 | 0.0 | 0.0 | 0.0 | 0.0 | - | - |
RAG | 59.2 | 62.96 | 23.12 | 23.94 | 28.68 | 33.04 | - | - |
MetaRAG | 78.66 | 81.71 | 61.88 | 63.09 | 66.36 | 76.24 | - | - |
GENRE | 0.1 | 7.67 | 0.04 | 6.66 | 79.42 | 85.33 | - | - |
TABi | 0.0 | 0.0 | 0.0 | 0.0 | 81.9 | 89.36 | - | - |
Sphere | 57.02 | 61.46 | 0.0 | 0.0 | 0.0 | 0.0 | - | - |
single ngram | 83.72 | 86.53 | 60.08 | 61.72 | 67.8 | 81.52 | - | - |
JivBest | 0.02 | 2.04 | 0.0 | 0.0 | 0.0 | 0.0 | - | - |
Re2G | 87.68 | 89.93 | 75.84 | 77.05 | 80.7 | 89.0 | Re2G: Retrieve, Rerank, Generate | |
KGI_0 (reupload) | 77.9 | 81.31 | 55.54 | 56.79 | 59.7 | 70.38 | - | - |
chriskuei | 0.0 | 0.0 | 0.0 | 0.0 | 79.98 | 85.75 | - | - |
Coop. DistilBert | 49.04 | 54.62 | 36.68 | 39.57 | 48.08 | 51.86 | - | - |
T5-base | 43.56 | 50.61 | 0.0 | 0.0 | 0.0 | 0.0 | KILT: a Benchmark for Knowledge Intensive Language Tasks | |
BART + DPR | 59.16 | 62.76 | 11.12 | 11.41 | 13.26 | 17.04 | - | - |
10k | 53.9 | 61.74 | 27.84 | 32.34 | 37.62 | 40.07 | - | - |
Wikipedia | 81.34 | 84.46 | 64.64 | 66.64 | 75.64 | 87.57 | - | - |
DensePhrases | 53.9 | 61.74 | 27.84 | 32.34 | 37.62 | 40.07 | Learning Dense Representations of Phrases at Scale | |