Text Classification On This Is Not A Dataset
Evaluation Metrics
Accuracy
Coherence
Evaluation Results
Performance of each model on this benchmark
Model Name | Accuracy | Coherence | Paper Title | Repository |
---|---|---|---|---|
Vicuna13B v1.1 | 95.7 | 81.2 | This is not a Dataset: A Large Negation Benchmark to Challenge Large Language Models | |
Flan-T5-xxl | 94.1 | 51.8 | This is not a Dataset: A Large Negation Benchmark to Challenge Large Language Models | |