PokerBench Poker Game Evaluation Dataset
PokerBench is a poker evaluation dataset developed in 2025 by a research team from the University of California, Berkeley and the Georgia Institute of Technology. It is designed to evaluate the performance of large language models (LLMs) in poker, a complex strategic game. The related paper is "PokerBench: Training Large Language Models to become Professional Poker Players". The dataset contains 11k key scenarios, split into 1k pre-flop and 10k post-flop scenarios, covering a wide range of game situations.
The dataset is built on Game Theory Optimal (GTO) poker strategy and was developed in collaboration with professional poker players to ensure diversity and representativeness. Using the GTOWizard and WASM-Postflop tools, the dataset ensures that the decision recorded for each scenario is consistent with the optimal strategy. Because poker decision trees are complex, the construction process also applies filtering and pruning strategies so that the evaluation stays comprehensive while remaining efficient.
With this dataset, researchers can quickly evaluate a model's performance in poker, particularly in mathematical reasoning, strategic planning, and predicting opponent behavior.
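As a rough illustration of such an evaluation, the sketch below scores model outputs against reference decisions and reports accuracy separately for pre-flop and post-flop scenarios. The record fields `stage` and `gto_action`, the function names, and the toy data are all assumptions made for this example; they are not the dataset's actual schema, and exact-match scoring is only one possible metric.

```python
from collections import defaultdict


def normalize(action: str) -> str:
    """Lowercase and collapse whitespace so 'Raise 3 ' matches 'raise 3'."""
    return " ".join(action.lower().split())


def accuracy_by_stage(records, predictions):
    """Return exact-match accuracy per stage.

    records: list of dicts with hypothetical keys 'stage' and 'gto_action'.
    predictions: list of model action strings, aligned with records.
    """
    if len(records) != len(predictions):
        raise ValueError("records and predictions must have the same length")
    hits = defaultdict(int)
    totals = defaultdict(int)
    for rec, pred in zip(records, predictions):
        stage = rec["stage"]
        totals[stage] += 1
        if normalize(pred) == normalize(rec["gto_action"]):
            hits[stage] += 1
    return {stage: hits[stage] / totals[stage] for stage in totals}


# Toy data, made up for illustration (not taken from the dataset).
records = [
    {"stage": "preflop", "gto_action": "raise 3"},
    {"stage": "postflop", "gto_action": "check"},
    {"stage": "postflop", "gto_action": "bet 10"},
    {"stage": "postflop", "gto_action": "fold"},
]
predictions = ["Raise 3", "check", "bet 5", "call"]

scores = accuracy_by_stage(records, predictions)
print(scores)
```

Reporting the two stages separately keeps the much larger post-flop set from hiding weak pre-flop play, or the reverse.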
