Task 1: Grouping on OCW

Metrics

Wasserstein Distance (WD)

# Correct Groups

# Solved Walls

Adjusted Mutual Information (AMI)

Adjusted Rand Index (ARI)

Fowlkes Mallows Score (FMS)
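
As a rough illustration of what some of these metrics measure, the sketch below scores one predicted grouping of a 16-clue wall against its gold grouping. It covers only # Correct Groups, the solved-wall check, FMS, and ARI, computed from pair counts using the standard definitions; WD and AMI are left out. The clue names, the helper function names, and the 4x4 wall layout are assumptions made for the example, and the benchmark's own evaluation code may differ in details such as scaling or averaging.

```python
from itertools import combinations
from math import sqrt


def correct_groups(pred, gold):
    """Count predicted groups that exactly match a gold group (order ignored)."""
    gold_sets = {frozenset(g) for g in gold}
    return sum(frozenset(p) in gold_sets for p in pred)


def pair_counts(pred, gold):
    """Return (tp, fp, fn, tn) over all unordered pairs of clues."""
    label_pred = {clue: i for i, g in enumerate(pred) for clue in g}
    label_gold = {clue: i for i, g in enumerate(gold) for clue in g}
    tp = fp = fn = tn = 0
    for a, b in combinations(sorted(label_gold), 2):
        same_pred = label_pred[a] == label_pred[b]
        same_gold = label_gold[a] == label_gold[b]
        if same_pred and same_gold:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_gold:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def fowlkes_mallows(pred, gold):
    """Geometric mean of pairwise precision and recall."""
    tp, fp, fn, _ = pair_counts(pred, gold)
    return tp / sqrt((tp + fp) * (tp + fn)) if tp else 0.0


def adjusted_rand(pred, gold):
    """Rand index corrected for chance agreement."""
    tp, fp, fn, tn = pair_counts(pred, gold)
    total = tp + fp + fn + tn
    expected = (tp + fp) * (tp + fn) / total
    maximum = ((tp + fp) + (tp + fn)) / 2
    if maximum == expected:
        return 1.0
    return (tp - expected) / (maximum - expected)


# Hypothetical wall: four gold groups of four clues each.
gold = [[f"{g}{i}" for i in range(1, 5)] for g in "abcd"]
# Hypothetical prediction: clues a4 and b1 are swapped between two groups.
pred = [["a1", "a2", "a3", "b1"], ["a4", "b2", "b3", "b4"], gold[2], gold[3]]

n_correct = correct_groups(pred, gold)
solved = n_correct == len(gold)  # a wall counts as solved only if every group matches
fms = fowlkes_mallows(pred, gold)
ari = adjusted_rand(pred, gold)
print(n_correct, solved, round(fms, 4), round(ari, 4))
```

With one swapped pair of clues, two of the four groups are still exact matches, so the wall is not solved even though the pairwise scores stay well above zero. This is why the group and wall counts can be far stricter than the clustering scores.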

Results

Performance results of various models on this benchmark

Model	WD	# Correct Groups	# Solved Walls	AMI	ARI	FMS	Paper Title
Human Performance	-	1405	285	-	-	-	Large Language Models are Fixated by Red Herrings: Exploring Creative Problem Solving and Einstellung Effect using the Only Connect Wall Dataset
GPT-4 (3-shot)	73.7	272	5	33.6	29.9	43.9	GPT-4 Technical Report
GPT-4 (5-shot)	72.9	269	7	32.8	29.1	43.4	GPT-4 Technical Report
GPT-4 (1-shot)	73.4	262	4	33.5	29.7	43.7	GPT-4 Technical Report
GPT-4 (100-shot)	73.6	249	3	32.3	28.5	42.8	GPT-4 Technical Report
GPT-4 (0-shot)	75.8	239	6	30.7	27.2	41.5	GPT-4 Technical Report
GPT-3.5-turbo (5-shot)	80.6	149	2	25.4	22.0	37.3	GPT-4 Technical Report
GPT-3.5-turbo (3-shot)	80.9	140	0	24.7	21.3	36.8	GPT-4 Technical Report
GPT-3.5-turbo (10-shot)	81.2	137	2	24.0	20.4	36.1	GPT-4 Technical Report
GPT-3.5-turbo (1-shot)	82.3	123	0	21.2	18.2	34.4	GPT-4 Technical Report
GPT-3.5-turbo (0-shot)	82.5	114	0	21.6	18.4	34.0	GPT-4 Technical Report
E5 (BASE)	83.8 ± .6	89 ± 6	1 ± 0	19.5 ± .4	16.3 ± .4	33.1 ± .3	Text Embeddings by Weakly-Supervised Contrastive Pre-training
FastText (Crawl)	84.2 ± .5	80 ± 4	0 ± 0	18.4 ± .4	15.2 ± .3	32.1 ± .3	Learning Word Vectors for 157 Languages
E5 (LARGE)	84.4 ± .7	76 ± 5	0 ± 0	18.5 ± .6	15.4 ± .5	32.3 ± .4	Text Embeddings by Weakly-Supervised Contrastive Pre-training
GloVe	84.9 ± .4	68 ± 4	0 ± 0	17.6 ± .4	14.4 ± .3	31.5 ± .3	-
FastText (News)	85.5 ± .5	62 ± 3	0 ± 0	15.8 ± .3	13.0 ± .2	30.4 ± .2	Learning Word Vectors for 157 Languages
ELMo (LARGE)	-	55 ± 4	0 ± 0	14.5 ± .4	11.8 ± .4	29.5 ± .3	Deep contextualized word representations
all-mpnet (BASE)	86.3 ± .4	50 ± 4	0 ± 0	14.3 ± .5	11.7 ± .4	29.4 ± .3	MPNet: Masked and Permuted Pre-training for Language Understanding
DistilBERT (BASE)	-	49 ± 4	0 ± 0	14.0 ± .3	11.3 ± .3	29.1 ± .2	DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
BERT (LARGE)	88.3 ± .5	33 ± 2	0 ± 0	10.3 ± .3	8.2 ± .3	26.5 ± .2	BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
