HyperAI
Accueil
Actualités
Articles de recherche récents
Tutoriels
Ensembles de données
Wiki
SOTA
Modèles LLM
Classement GPU
Événements
Recherche
À propos
Français
Système
HyperAI
Toggle sidebar
Rechercher sur le site...
⌘
K
Connexion
Connexion
Accueil
SOTA
Sentence Ordering
Sentence Ordering On Econlogicqa
Sentence Ordering On Econlogicqa
Métriques
Accuracy
Résultats
Résultats de performance de divers modèles sur ce benchmark
Columns
Nom du modèle
Accuracy
Paper Title
Repository
Gemma-7B-IT
0.0231
EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning
Mistral-7B-Instruct-v0.1
0.1538
EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning
Yi-6B-Chat
0.2077
EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning
Mistral-7B-Instruct-v0.2
0.3154
EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning
Llama-2-7B
0.0077
EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning
Zephyr-7B-Alpha
0.2308
EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning
Yi-6B
0.0385
EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning
Llama-3-8B
0.2385
EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning
Mistral-7B-v0.2
0.2615
EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning
Mistral-7B-v0.1
0.2615
EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning
Zephyr-7B-Beta
0.1769
EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning
Llama-2-13B-Chat
0.1462
EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning
Llama-2-7B-Chat
0.0923
EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning
Llama-3-8B-Instruct
0.3462
EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning
Gemma-2B-IT
0.0846
EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning
GPT-3.5-Turbo
0.3769
EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning
GPT-4
0.5538
EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning
GPT-4-Turbo
0.5692
EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning
0 of 18 row(s) selected.
Previous
Next