Long Context Understanding on Ada-LEval (TSort)

Metrics

128k
16k
2k
32k
4k
64k
8k

Results

Performance results of various models on this benchmark

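| Model Name | 128k | 16k | 2k | 32k | 4k | 64k | 8k | Paper Title | Repository |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GPT-4-Turbo-0125 | 2.0 | 5.5 | 15.5 | 2.0 | 16.5 | 4.0 | 8.5 | GPT-4 Technical Report | - |
| GPT-3.5-Turbo-1106 | - | 5.5 | 4.0 | - | 4.5 | - | 4.5 | - | - |
| ChatGLM2-6b-32k | - | 0.9 | 0.9 | - | 0.2 | - | 0.7 | GLM-130B: An Open Bilingual Pre-trained Model | - |
| ChatGLM3-6b-32k | - | 0.7 | 2.3 | - | 2.4 | - | 2.0 | GLM-130B: An Open Bilingual Pre-trained Model | - |
| LongChat-7b-v1.5-32k | - | 2.5 | 5.3 | - | 5.0 | - | 3.1 | Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena | - |
| Vicuna-7b-v1.5-16k | - | 1.7 | 5.3 | - | 2.2 | - | 2.3 | Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena | - |
| Claude-2 | - | 3.0 | 5.0 | 0.0 | 5.0 | 0.0 | 4.5 | - | - |
| Vicuna-13b-v1.5-16k | - | 3.1 | 5.4 | - | 5.0 | - | 2.4 | Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena | - |
| GPT-4-Turbo-1106 | 6.0 | 3.5 | 18.5 | 6.0 | 15.5 | 6.0 | 7.5 | GPT-4 Technical Report | - |
| InternLM2-7b | - | 4.3 | 5.1 | - | 3.9 | - | 5.1 | InternLM2 Technical Report | - |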