Natural Language Understanding on LexGLUE
Evaluation Metrics
CaseHOLD
ECtHR Task A
ECtHR Task B
EUR-LEX
LEDGAR
SCOTUS
UNFAIR-ToS
Evaluation Results
Performance of each model on this benchmark
Comparison Table
Model Name | CaseHOLD | ECtHR Task A | ECtHR Task B | EUR-LEX | LEDGAR | SCOTUS | UNFAIR-ToS |
---|---|---|---|---|---|---|---|
lexglue-a-benchmark-dataset-for-legal | 75.6 | 71.2 / 64.2 | 88.0 / 77.5 | 71.0 / 55.9 | 88.0 / 82.3 | 76.4 / 66.2 | 88.3 / 81.0 |
lexglue-a-benchmark-dataset-for-legal | 71.7 | 69.5 / 60.7 | 87.2 / 77.3 | 71.8 / 57.5 | 87.9 / 82.1 | 70.8 / 61.2 | 87.7 / 81.5 |
lexglue-a-benchmark-dataset-for-legal | 70.7 | 71.4 / 64.0 | 87.6 / 77.8 | 71.6 / 55.6 | 87.7 / 82.2 | 70.5 / 60.9 | 87.5 / 81.0 |
the-unreasonable-effectiveness-of-the | - | 66.3 / 55.0 | 76.0 / 65.4 | 65.7 / 49.0 | 88.0 / 82.6 | 74.4 / 64.5 | - |
lexglue-a-benchmark-dataset-for-legal | 72.1 | 69.1 / 61.2 | 87.4 / 77.3 | 72.3 / 57.2 | 87.9 / 82.0 | 70.0 / 60.0 | 87.2 / 78.8 |
lexglue-a-benchmark-dataset-for-legal | 72.0 | 69.6 / 62.4 | 88.0 / 77.8 | 71.9 / 56.7 | 87.7 / 82.3 | 72.2 / 62.5 | 87.7 / 80.1 |
lexglue-a-benchmark-dataset-for-legal | 75.1 | 71.2 / 64.6 | 88.0 / 77.2 | 72.2 / 56.2 | 88.1 / 82.7 | 76.2 / 65.8 | 88.6 / 82.3 |
lexglue-a-benchmark-dataset-for-legal | 70.4 | 70.5 / 63.8 | 88.1 / 76.6 | 71.8 / 56.6 | 87.7 / 82.1 | 71.7 / 61.4 | 87.7 / 80.2 |
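Most cells in the table above hold two values separated by a slash, CaseHOLD holds a single value, and a dash marks a missing result. The page itself does not label the two values, so the following minimal sketch keeps them as an unlabelled tuple rather than assuming which metric each one is. The helper names `parse_cell` and `parse_row` are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

Score = Optional[Tuple[float, ...]]


def parse_cell(cell: str) -> Score:
    """Parse one score cell: '75.6', '71.2 / 64.2', or '-' for a missing result."""
    text = cell.strip()
    if text == "-":
        return None
    # float() tolerates the surrounding spaces left by splitting on '/'.
    return tuple(float(part) for part in text.split("/"))


def parse_row(line: str) -> Tuple[str, List[Score]]:
    """Split a pipe-delimited table row into (model name, list of parsed scores)."""
    # Drop the trailing pipe before splitting so no empty final cell is produced.
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    return cells[0], [parse_cell(c) for c in cells[1:]]


# Row copied verbatim from the table above.
row = (
    "the-unreasonable-effectiveness-of-the | - | 66.3 / 55.0 | 76.0 / 65.4 | "
    "65.7 / 49.0 | 88.0 / 82.6 | 74.4 / 64.5 | - |"
)
name, scores = parse_row(row)
```

Returning `None` for missing cells keeps the seven task columns aligned, so incomplete rows can still be compared column by column with the others.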