Logical Reasoning
Benchmark List
All benchmarks related to this task
lingoly
Best model: Claude Opus
big-bench-formal-fallacies-syllogisms
big-bench-logic-grid-puzzle
big-bench-logical-fallacy-detection
big-bench-penguins-in-a-table
big-bench-reasoning-about-colored-objects
big-bench-strategyqa
big-bench-temporal-sequences
ruworldtree
winograd-automatic