HyperAI초신경

Text-to-SQL on BIRD (Big Bench for Large-Scale Database Grounded Text-to-SQL Evaluation)

Evaluation Metrics

Execution Accuracy % (Dev)
Execution Accuracy % (Test)
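
Execution accuracy is commonly understood as the share of questions for which the predicted SQL, when run against the database, returns the same result as the reference SQL. The sketch below illustrates that idea on a toy in-memory SQLite database. The helper names (`execution_match`, `execution_accuracy`), the `singer` table, and the choice to compare results as unordered sets of rows are assumptions made for illustration; this is not the benchmark's official evaluation script.

```python
import sqlite3


def execution_match(conn, predicted_sql, gold_sql):
    """Return True if both queries yield the same set of rows (assumed criterion)."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A predicted query that fails to execute is counted as incorrect.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    # Compare as unordered sets of row tuples (illustrative assumption).
    return set(predicted_rows) == set(gold_rows)


def execution_accuracy(conn, pairs):
    """Percentage of (predicted, gold) query pairs whose results match."""
    if not pairs:
        return 0.0
    hits = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return 100.0 * hits / len(pairs)


# Hypothetical toy database and query pairs, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO singer VALUES (?, ?)",
    [("Ana", 31), ("Bo", 24), ("Cy", 45)],
)

pairs = [
    # Different SQL text, same result: counted as correct.
    ("SELECT name FROM singer WHERE age > 30",
     "SELECT name FROM singer WHERE age >= 31"),
    # Runs, but returns different rows: counted as incorrect.
    ("SELECT name FROM singer",
     "SELECT name FROM singer WHERE age < 30"),
    # Misspelled column, fails to execute: counted as incorrect.
    ("SELECT nam FROM singer",
     "SELECT name FROM singer"),
]

score = execution_accuracy(conn, pairs)
print(f"Execution accuracy: {score:.2f}%")
```

Comparing results rather than SQL strings means two differently written queries can both count as correct, which is why the metric is reported as a percentage over the Dev and Test splits.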

Evaluation Results

Performance results of each model on this benchmark

Comparison Table
| Model Name | Execution Accuracy % (Dev) | Execution Accuracy % (Test) |
| --- | --- | --- |
| Model 1 | 68.12 | 70.21 |
| msc-sql-multi-sample-critiquing-small | 65.6 | - |
| Model 3 | 59.71 | 60.71 |
| Model 4 | 58.47 | 60.37 |
| Model 5 | 62.97 | 64.51 |
| text-to-sql-empowered-by-large-language | 54.76 | 57.41 |
| can-llm-already-serve-as-a-database-interface | 37.22 | 39.30 |
| Model 8 | 55.48 | 63.39 |
| Model 9 | 72.43 | 73.17 |
| can-llm-already-serve-as-a-database-interface | - | - |
| Model 11 | 55.48 | 63.39 |
| can-llms-effectively-leverage-structural | 42.70 | 49.02 |
| Model 13 | 64.73 | 65.23 |
| Model 14 | 67.99 | 66.21 |
| Model 15 | 65.45 | 68.87 |
| Model 16 | 63.36 | 65.45 |
| can-llm-already-serve-as-a-database-interface | 34.35 | 36.47 |
| chase-sql-multi-path-reasoning-and-preference | 73.14 | 74.06 |
| xiyan-sql-a-multi-generator-ensemble | 73.34 | 75.63 |
| Model 20 | 69.3 | 72.28 |
| chess-contextual-harnessing-for-efficient-sql | 65 | 66.69 |
| can-llms-effectively-leverage-structural | 46.35 | 54.89 |
| Model 23 | 60.5 | 64.84 |
| Model 24 | 62.58 | 63.22 |
| Model 25 | 57.17 | 59.25 |
| Model 26 | 58.5 | 62.66 |
| mac-sql-multi-agent-collaboration-for-text-to | 57.56 | 59.59 |
| Model 28 | 66.82 | 64.00 |
| Model 29 | 65.38 | 67.86 |
| Model 30 | 64.62 | - |
| knowledge-to-sql-enhancing-sql-generation | 48.92 | - |
| the-death-of-schema-linking-text-to-sql-in | 67.21 | 71.83 |
| can-llm-already-serve-as-a-database-interface | 36.64 | 40.08 |
| Model 34 | 37.68 | 47.74 |
| din-sql-decomposed-in-context-learning-of-1 | 50.72 | 55.90 |
| Model 36 | 66.95 | 69.03 |
| Model 37 | 72.16 | 70.26 |
| can-llm-already-serve-as-a-database-interface | 27.38 | 33.04 |
| Model 39 | 74.32 | 74.12 |
| Model 40 | 61.34 | 64.95 |