HyperAI

Task 1 Grouping on OCW

Metrics

Wasserstein Distance (WD)
# Correct Groups
# Solved Walls
Adjusted Mutual Information (AMI)
Adjusted Rand Index (ARI)
Fowlkes Mallows Score (FMS)
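To make the counting metrics above concrete, here is a minimal sketch in plain Python. It rests on assumptions that this page does not state: that each wall consists of 16 clues to be split into 4 groups of 4, that a group counts as correct only when it matches a ground-truth group exactly, and that a wall counts as solved only when all of its groups are correct. The function names and the toy clues are hypothetical, and the Fowlkes Mallows function uses the standard pair-counting definition rather than the benchmark's own evaluation code.

```python
from itertools import combinations
from math import sqrt


def correct_groups(predicted, truth):
    """Count predicted groups that exactly match a ground-truth group."""
    truth_sets = {frozenset(group) for group in truth}
    return sum(1 for group in predicted if frozenset(group) in truth_sets)


def is_solved(predicted, truth):
    """Assumption: a wall is solved only if every group is correct."""
    return correct_groups(predicted, truth) == len(truth)


def fowlkes_mallows(predicted, truth):
    """Pair-counting FMS: TP / sqrt((TP + FP) * (TP + FN))."""
    pred_label = {clue: i for i, group in enumerate(predicted) for clue in group}
    true_label = {clue: i for i, group in enumerate(truth) for clue in group}
    tp = fp = fn = 0
    for a, b in combinations(sorted(true_label), 2):
        same_pred = pred_label[a] == pred_label[b]
        same_true = true_label[a] == true_label[b]
        if same_pred and same_true:
            tp += 1  # pair grouped together in both partitions
        elif same_pred:
            fp += 1  # grouped together only in the prediction
        elif same_true:
            fn += 1  # grouped together only in the ground truth
    denom = sqrt((tp + fp) * (tp + fn))
    return tp / denom if denom else 0.0


# Toy wall with made-up clues: groups a and b are right,
# while one clue is swapped between groups c and d.
truth = [
    ["a1", "a2", "a3", "a4"],
    ["b1", "b2", "b3", "b4"],
    ["c1", "c2", "c3", "c4"],
    ["d1", "d2", "d3", "d4"],
]
predicted = [
    ["a1", "a2", "a3", "a4"],
    ["b1", "b2", "b3", "b4"],
    ["c1", "c2", "c3", "d4"],
    ["d1", "d2", "d3", "c4"],
]

print(correct_groups(predicted, truth))   # 2
print(is_solved(predicted, truth))        # False
print(fowlkes_mallows(predicted, truth))  # 0.75
```

The sketch returns FMS as a fraction between 0 and 1, whereas the values reported on this page appear to be on a 0 to 100 scale. AMI and ARI adjust a similar agreement measure for chance; scikit-learn provides `adjusted_mutual_info_score`, `adjusted_rand_score`, and `fowlkes_mallows_score` in `sklearn.metrics` for label-vector inputs.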

Results

Performance results of various models on this benchmark

Comparison table
| Model name | Wasserstein Distance (WD) | # Correct Groups | # Solved Walls | Adjusted Mutual Information (AMI) | Adjusted Rand Index (ARI) | Fowlkes Mallows Score (FMS) |
|---|---|---|---|---|---|---|
| pre-training-of-deep-bidirectional-protein | 89.5 ± .4 | 22 ± 2 | 0 ± 0 | 8.1 ± .4 | 6.4 ± .3 | 25.1 ± .2 |
| gpt-4-technical-report-1 | 82.5 | 114 | 0 | 21.6 | 18.4 | 34.0 |
| gpt-4-technical-report-1 | 82.3 | 123 | 0 | 21.2 | 18.2 | 34.4 |
| gpt-4-technical-report-1 | 73.4 | 262 | 4 | 33.5 | 29.7 | 43.7 |
| gpt-4-technical-report-1 | 81.2 | 137 | 2 | 24.0 | 20.4 | 36.1 |
| text-embeddings-by-weakly-supervised | 84.4 ± .7 | 76 ± 5 | 0 ± 0 | 18.5 ± .6 | 15.4 ± .5 | 32.3 ± .4 |
| learning-word-vectors-for-157-languages | 85.5 ± .5 | 62 ± 3 | 0 ± 0 | 15.8 ± .3 | 13.0 ± .2 | 30.4 ± .2 |
| learning-word-vectors-for-157-languages | 84.2 ± .5 | 80 ± 4 | 0 ± 0 | 18.4 ± .4 | 15.2 ± .3 | 32.1 ± .3 |
| text-embeddings-by-weakly-supervised | 83.8 ± .6 | 89 ± 6 | 1 ± 0 | 19.5 ± .4 | 16.3 ± .4 | 33.1 ± .3 |
| pre-training-of-deep-bidirectional-protein | 88.3 ± .5 | 33 ± 2 | 0 ± 0 | 10.3 ± .3 | 8.2 ± .3 | 26.5 ± .2 |
| gpt-4-technical-report-1 | 72.9 | 269 | 7 | 32.8 | 29.1 | 43.4 |
| large-language-models-are-fixated-by-red-1 | - | 1405 | 285 | - | - | - |
| gpt-4-technical-report-1 | 80.6 | 149 | 2 | 25.4 | 22.0 | 37.3 |
| glove-global-vectors-for-word-representation | 84.9 ± .4 | 68 ± 4 | 0 ± 0 | 17.6 ± .4 | 14.4 ± .3 | 31.5 ± .3 |
| gpt-4-technical-report-1 | 75.8 | 239 | 6 | 30.7 | 27.2 | 41.5 |
| deep-contextualized-word-representations | - | 55 ± 4 | 0 ± 0 | 14.5 ± .4 | 11.8 ± .4 | 29.5 ± .3 |
| distilbert-a-distilled-version-of-bert | - | 49 ± 4 | 0 ± 0 | 14.0 ± .3 | 11.3 ± .3 | 29.1 ± .2 |
| gpt-4-technical-report-1 | 80.9 | 140 | 0 | 24.7 | 21.3 | 36.8 |
| gpt-4-technical-report-1 | 73.6 | 249 | 3 | 32.3 | 28.5 | 42.8 |
| roberta-a-robustly-optimized-bert-pretraining | - | 29 ± 3 | 0 ± 0 | 9.4 ± .4 | 8.4 ± .3 | 26.7 ± .2 |
| gpt-4-technical-report-1 | 73.7 | 272 | 5 | 33.6 | 29.9 | 43.9 |
| mpnet-masked-and-permuted-pre-training-for | 86.3 ± .4 | 50 ± 4 | 0 ± 0 | 14.3 ± .5 | 11.7 ± .4 | 29.4 ± .3 |