Text Classification On This Is Not A Dataset
Metrics
Accuracy
Coherence
Results
Performance results of each model on this benchmark
| Model | Accuracy | Coherence | Paper Title |
|---|---|---|---|
| Vicuna13B v1.1 | 95.7 | 81.2 | This is not a Dataset: A Large Negation Benchmark to Challenge Large Language Models |
| Flan-T5-xxl | 94.1 | 51.8 | This is not a Dataset: A Large Negation Benchmark to Challenge Large Language Models |