HyperAI超神经

Emotional Intelligence On Emotional

Evaluation Metrics

EQ-Bench Score

Evaluation Results

Performance of each model on this benchmark

| Model Name | EQ-Bench Score | Paper Title | Repository |
| --- | --- | --- | --- |
| OpenAI gpt-3.5-0613 | 49.17 | EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models | |
| lmsys/vicuna-33b-v1.3 | 36.52 | EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models | |
| lmsys/vicuna-13b-v1.1 | 32.85 | EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models | |
| OpenAI text-davinci-002 | 39.44 | EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models | |
| OpenAI text-davinci-003 | 43.73 | EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models | |
| meta-llama/Llama-2-70b-chat-hf | 51.56 | EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models | |
| OpenAI ADA | 2.25 | EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models | |
| meta-llama/Llama-2-7b-chat-hf | 25.43 | EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models | |
| OpenAI gpt-3.5-turbo-0301 | 47.61 | EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models | |
| Intel/neural-chat-7b-v3-1 | 43.61 | EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models | |
| Qwen/Qwen-72B-Chat | 52.44 | EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models | |
| openchat/openchat 3.5 | 37.08 | EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models | |
| migtissera/SynthIA-70B-v1.5 | 54.83 | EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models | |
| Open-Orca/Mistral-7B-OpenOrca | 44.40 | EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models | |
| OpenAI gpt-4-0613 | 62.52 | EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models | |
| OpenAI gpt-4-0314 | 53.39 | EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models | |
| Qwen/Qwen-14B-Chat | 43.76 | EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models | |
| Koala 13B | 24.92 | EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models | |
| meta-llama/Llama-2-13b-chat-hf | 33.02 | EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models | |