Question Answering on QuALITY
Evaluation Metric
Accuracy
Evaluation Results
Performance of each model on this benchmark
Model Name | Accuracy | Paper Title | Repository |
---|---|---|---|
Claude Instant 1.1 (5-shot) | 80.5 | Model Card and Evaluations for Claude Models | - |
Claude 1.3 (5-shot) | 84.1 | Model Card and Evaluations for Claude Models | - |
Claude 2 (5-shot) | 83.2 | Model Card and Evaluations for Claude Models | - |
RAPTOR + GPT-4 (June 2023) | 82.6 | RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval | |