Text-to-SQL on BIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation)

Evaluation results

Performance of each model on this benchmark

Model	Execution Accuracy % (Dev)	Execution Accuracy % (Test)	Paper Title
DSAIR + GPT-4o	74.32	74.12	-
XiYan-SQL	73.34	75.63	A Preview of XiYan-SQL: A Multi-Generator Ensemble Framework for Text-to-SQL
CHASE-SQL + Gemini	73.14	74.06	CHASE-SQL: Multi-Path Reasoning and Preference Optimized Candidate Selection in Text-to-SQL
ExSL + granite-34b-code	72.43	73.17	-
Insights AI	72.16	70.26	-
OpenSearch-SQL+ v2 + GPT-4o	69.3	72.28	-
PURPLE + RED + GPT-4o	68.12	70.21	-
Arcwise + GPT-4o	67.99	66.21	-
Distillery + GPT-4o	67.21	71.83	The Death of Schema Linking? Text-to-SQL in the Age of Well-Reasoned Language Models
RECAP + Gemini	66.95	69.03	-
MSL-SQL + DeepSeek-V2.5	66.82	64.00	-
MSc-SQL	65.6	-	MSc-SQL: Multi-Sample Critiquing Small Language Models For Text-To-SQL Translation
ByteBrain	65.45	68.87	-
ExSL + granite-20b-code	65.38	67.86	-
CHESS	65	66.69	CHESS: Contextual Harnessing for Efficient SQL Synthesis
SCL-SQL	64.73	65.23	-
SFT CodeS-15B + SQLFixAgent	64.62	-	-
MCS-SQL + GPT-4	63.36	65.45	-
PURPLE + GPT-4o	62.97	64.51	-
GRA-SQL	62.58	63.22	-
