| Model | Score | Paper / Source | Code |
| ----- | :---: | -------------- | ---- |
| Turing NLR v5 XXL 5.4B (fine-tuned) | 95.9 | - | - |
| BERTwiki 340M (fine-tuned on WSCR) | 74.7 | A Surprisingly Robust Trick for Winograd Schema Challenge | |
| BERT-large 340M (fine-tuned on WSCR) | 71.9 | A Surprisingly Robust Trick for Winograd Schema Challenge | |
| FLAN 137B (few-shot, k=4) | 70.4 | Finetuned Language Models Are Zero-Shot Learners | |