HyperAI

Long Context Understanding on Ada-LEval

Metrics

Scores are reported at each evaluated context length:

1k
2k
4k
6k
8k
12k
16k

Results

Performance of various models on this benchmark, scored at each context length:

| Model Name | 1k | 2k | 4k | 6k | 8k | 12k | 16k | Paper Title | Repository |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Vicuna-7b-v1.5-16k | 37.0 | 11.1 | 5.8 | 3.2 | 1.8 | 1.9 | 1.0 | Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena | - |
| Claude-2 | 65.0 | 43.5 | 23.5 | 15.0 | 17.0 | 12.0 | 11.0 | - | - |
| LongChat-7b-v1.5-32k | 32.4 | 10.7 | 5.7 | 3.1 | 1.9 | 1.6 | 0.8 | Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena | - |
| Vicuna-13b-v1.5-16k | 53.4 | 29.2 | 13.1 | 4.3 | 2.2 | 1.4 | 0.9 | Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena | - |
| GPT-3.5-Turbo-1106 | 61.5 | 48.5 | 41.5 | 29.5 | 17.0 | 2.5 | 2.5 | - | - |
| ChatGLM3-6b-32k | 39.8 | 18.8 | 9.0 | 5.0 | 3.4 | 0.9 | 0.5 | GLM-130B: An Open Bilingual Pre-trained Model | - |
| ChatGLM2-6b-32k | 31.2 | 10.9 | 4.5 | 1.6 | 1.6 | 0.0 | 0.3 | GLM-130B: An Open Bilingual Pre-trained Model | - |
| InternLM2-7b | 58.6 | 49.5 | 33.9 | 12.3 | 13.4 | 2.0 | 0.8 | InternLM2 Technical Report | - |
| GPT-4-Turbo-0125 | 73.5 | 73.5 | 65.5 | 63.0 | 56.5 | 52.0 | 44.5 | GPT-4 Technical Report | - |
| GPT-4-Turbo-1106 | 74.0 | 73.5 | 67.5 | 59.5 | 53.5 | 49.5 | 44.0 | GPT-4 Technical Report | - |
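Since every model is scored at the same seven context lengths, the drop from the shortest (1k) to the longest (16k) setting gives a quick summary of long-context degradation. A minimal sketch, assuming the scores above; the `SCORES` dictionary copies a few rows from the table, and the `degradation` helper is purely illustrative, not part of the benchmark:

```python
# Scores copied from the results table above, keyed by context length.
SCORES = {
    "GPT-4-Turbo-0125":   {"1k": 73.5, "2k": 73.5, "4k": 65.5, "6k": 63.0,
                           "8k": 56.5, "12k": 52.0, "16k": 44.5},
    "GPT-3.5-Turbo-1106": {"1k": 61.5, "2k": 48.5, "4k": 41.5, "6k": 29.5,
                           "8k": 17.0, "12k": 2.5, "16k": 2.5},
    "Claude-2":           {"1k": 65.0, "2k": 43.5, "4k": 23.5, "6k": 15.0,
                           "8k": 17.0, "12k": 12.0, "16k": 11.0},
}

def degradation(model: str) -> float:
    """Absolute score drop from the 1k setting to the 16k setting (illustrative helper)."""
    s = SCORES[model]
    return round(s["1k"] - s["16k"], 1)

for name in SCORES:
    print(f"{name}: -{degradation(name)} points from 1k to 16k")
```

Even the strongest model in the table loses a large fraction of its score as the context grows, which is the behavior this benchmark is designed to expose.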