LLaMA 65B (few-shot, k=64) | 73.0 | LLaMA: Open and Efficient Foundation Language Models | - |
GaC(Qwen2-72B-Instruct + Llama-3-70B-Instruct) | 79.29 | Breaking the Ceiling of the LLM Community by Treating Token Generation as a Classification for Ensembling | - |
RankRAG-llama3-8b (Zero-Shot, KILT) | 82.9 | RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs | - |
RankRAG-llama3-70b (Zero-Shot, KILT) | 86.5 | RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs | - |
Claude 2 (few-shot, k=5) | 87.5 | Model Card and Evaluations for Claude Models | - |
Claude Instant 1.1 (few-shot, k=5) | 78.9 | Model Card and Evaluations for Claude Models | - |