Resources - Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena | Papers | HyperAI

Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena

8 months ago

Preference Modeling


opengvlab/multi-modality-arena (561 stars, pytorch)
lm-sys/routellm (4.8k stars, pytorch)
formulamonks/llm-benchmarker-suite (49 stars, pytorch)
ojiyumm/mt_bench_rwkv (0 stars, pytorch)
lm-sys/fastchat (39.5k stars, official, pytorch)
ilyagusev/ping_pong_bench (117 stars)
theoremone/llm-benchmarker-suite (49 stars, pytorch)
PAIR-code/llm-comparator (526 stars, tf)
kuk/rulm-sbs2 (61 stars)
dongping-chen/mllm-as-a-judge (92 stars, pytorch)
bjoernpl/fasteval (1 star)
