| Model | Accuracy (%) | Paper | Code |
| --- | --- | --- | --- |
| PaLM 2 (few-shot, CoT, SC) | 95.1 | PaLM 2 Technical Report | - |
| Claude 2 (few-shot, k=5) | 91 | Model Card and Evaluations for Claude Models | - |
| PaLM 540B (Self Improvement, CoT Prompting) | 88.3 | Large Language Models Can Self-Improve | - |
| PaLM 540B (Standard Prompting) | 87.1 | Large Language Models Can Self-Improve | - |
| ST-MoE-32B 269B (fine-tuned) | 86.5 | ST-MoE: Designing Stable and Transferable Sparse Expert Models | - |
| PaLM 540B (CoT Prompting) | 85.2 | Large Language Models Can Self-Improve | - |
| LLaMA 3 8B + MoSLoRA (fine-tuned) | 81.5 | Mixture-of-Subspaces in Low-Rank Adaptation | - |