Semantic Textual Similarity on SentEval
Evaluation Metrics
MRPC
SICK-E
SICK-R
STS
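This page does not say how each metric is scored. Similarity regression tasks such as SICK-R and STS are commonly reported as Pearson and Spearman correlations between predicted and gold similarity scores, so the sketch below assumes that convention. The helper names (`pearson`, `ranks`, `spearman`) and the sample scores are hypothetical and are not taken from the leaderboard.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


# Made-up gold and predicted similarity scores, for illustration only.
gold = [4.5, 1.0, 3.2, 2.5, 5.0]
pred = [4.1, 1.4, 3.0, 3.1, 4.8]
print(f"Pearson:  {pearson(gold, pred):.3f}")
print(f"Spearman: {spearman(gold, pred):.3f}")
```

In practice a library routine such as `scipy.stats.pearsonr` or `scipy.stats.spearmanr` would normally be used; the pure-Python version only makes the arithmetic explicit.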
Results
Performance of each model on this benchmark.
Model | MRPC | SICK-E | SICK-R | STS | Paper Title | Repository |
---|---|---|---|---|---|---|
Snorkel MeTaL(ensemble) | 91.5/88.5 | - | - | 90.1/89.7* | Training Complex Models with Multi-Task Weak Supervision | |
XLNet-Large | 93.0/90.7 | - | - | 91.6/91.1* | XLNet: Generalized Autoregressive Pretraining for Language Understanding | |
GenSen | 78.6/84.4 | 87.8 | 0.888 | 78.9/78.6 | Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning | |
TF-KLD | 80.4/85.9 | - | - | - | - | - |
MT-DNN-ensemble | 92.7/90.3 | - | - | 91.1/90.7* | Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding | |
InferSent | 76.2/83.1 | 86.3 | 0.884 | 75.8/75.5 | Supervised Learning of Universal Sentence Representations from Natural Language Inference Data | |