MT-Bench-101
Metrics
adaptability reflection(sa)
adaptability reflection(sc)
avg.
interactivity questioning(ic)
interactivity questioning(pi)
interference(cc)
interference(ts)
llm_model
memory cm
model_url
organization
parameters
perceptivity understanding(ar)
perceptivity understanding(si)
reasoning(gr)
reasoning(mr)
release_date
rephrasing(cr)
rephrasing(fr)
updated_time
Results
Performance results of various models on this benchmark
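| Paper Title | adaptability reflection(sa) | adaptability reflection(sc) | avg. | interactivity questioning(ic) | interactivity questioning(pi) | interference(cc) | interference(ts) | llm_model | memory cm | model_url | organization | parameters | perceptivity understanding(ar) | perceptivity understanding(si) | reasoning(gr) | reasoning(mr) | release_date | rephrasing(cr) | rephrasing(fr) | updated_time | Code |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| API | 4.97 | 8.45 | 6.53 | 5.23 | 5.11 | 8.5 | 8.23 | Llama2-7B-Chat | 7.64 | https://huggingface.co/meta-llama/Llama-2-7b-chat-hf | Meta | 7B | 7.92 | 6.21 | 3.83 | 1.88 | 2023.7.19 | 8.32 | 8.56 | 2024.6.25 | - |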
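The page does not say how the avg. column is derived. As a quick sanity check, the sketch below assumes it is the unweighted mean of the 13 per-task scores in the Llama2-7B-Chat row and compares the result with the reported 6.53. The dictionary keys are the metric names listed above; the helper name `unweighted_average` is hypothetical and only for illustration.

```python
from statistics import mean

# Per-task scores for Llama2-7B-Chat, copied from the results row.
# Assumption: each value pairs with the metric names in the order they are listed.
scores = {
    "adaptability reflection(sa)": 4.97,
    "adaptability reflection(sc)": 8.45,
    "interactivity questioning(ic)": 5.23,
    "interactivity questioning(pi)": 5.11,
    "interference(cc)": 8.5,
    "interference(ts)": 8.23,
    "memory cm": 7.64,
    "perceptivity understanding(ar)": 7.92,
    "perceptivity understanding(si)": 6.21,
    "reasoning(gr)": 3.83,
    "reasoning(mr)": 1.88,
    "rephrasing(cr)": 8.32,
    "rephrasing(fr)": 8.56,
}


def unweighted_average(task_scores):
    """Return the plain mean of the task scores, rounded to two decimals."""
    return round(mean(task_scores.values()), 2)


reported_avg = 6.53  # value in the avg. column
computed_avg = unweighted_average(scores)

# Compare the recomputed mean with the reported value.
print(computed_avg, computed_avg == reported_avg)
```

For this row the recomputed mean rounds to the reported value, which is consistent with the assumption but does not prove that the benchmark weights all tasks equally.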