Robot Task Planning on SheetCopilot
Evaluation Metrics
Pass@1
Evaluation Results
Performance of each model on this benchmark
Model | Pass@1 | Paper Title | Repository |
---|---|---|---|
SheetAgent (GPT-3.5) | 61.1% | SheetAgent: Towards A Generalist Agent for Spreadsheet Reasoning and Manipulation via Large Language Models | - |
SheetCopilot (NIPS2023) | 44.3% | SheetCopilot: Bringing Software Productivity to the Next Level through Large Language Models | |