RoBERTa-large 355M + Entailment as Few-shot Learner | 91.0 | Entailment as Few-Shot Learner | |
RoBERTa-large 355M (MLP quantized vector-wise, fine-tuned) | - | LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale | |
Q8BERT (Zafrir et al., 2019) | - | Q8BERT: Quantized 8Bit BERT | |
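The "MLP quantized vector-wise" entry refers to the vector-wise absmax Int8 quantization scheme described in the LLM.int8() paper. Below is a minimal illustrative sketch of that scheme, not the paper's actual kernel: the function name and test values are ours, the integer matmul is emulated in floating point, and the real method additionally keeps outlier feature dimensions in fp16 via mixed-precision decomposition.

```python
import torch

def vectorwise_int8_matmul(x: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    # Per-row absmax scales for the activations, per-column scales for the
    # weights, mapping each vector onto the int8 range [-127, 127].
    sx = x.abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / 127.0
    sw = w.abs().amax(dim=0, keepdim=True).clamp(min=1e-8) / 127.0
    xq = (x / sx).round().clamp(-127, 127)
    wq = (w / sw).round().clamp(-127, 127)
    # Integer matmul (emulated in float here), then dequantize with the
    # outer product of the row and column scales.
    return (xq @ wq) * (sx * sw)

x = torch.randn(4, 16)
w = torch.randn(16, 8)
# Quantization error should be small relative to the full-precision product.
print((vectorwise_int8_matmul(x, w) - x @ w).abs().max())
```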