Hybrid H3 125M (3-shot, logit scoring) | 48.9 | Hungry Hungry Hippos: Towards Language Modeling with State Space Models | |
Hybrid H3 355M (0-shot, logit scoring) | 59.5 | Hungry Hungry Hippos: Towards Language Modeling with State Space Models | |
GPT-3 175B (Few-Shot) | - | Language Models are Few-Shot Learners | |
Hybrid H3 355M (3-shot, logit scoring) | 59.7 | Hungry Hungry Hippos: Towards Language Modeling with State Space Models | |
PaLM 540B (finetuned) | 69.2 | PaLM: Scaling Language Modeling with Pathways | |
Hybrid H3 125M (0-shot, logit scoring) | 51.4 | Hungry Hungry Hippos: Towards Language Modeling with State Space Models | |