HyperAI

Probing Language Models on KAMEL

Metrics

Average F1

Results

Performance results of various models on this benchmark

Comparison Table
Model Name                                 Average F1
kamel-knowledge-analysis-with-multitoken   17.62