CoSQL Conversational Text to SQL Dataset

Date: 5 months ago
Size: 100.44 MB
Organization: Yale University

The CoSQL (Conversational Text-to-SQL Challenge) dataset was introduced by Yale University at EMNLP 2019. It is a corpus for building cross-domain, general-purpose dialogue systems that query databases.

CoSQL contains more than 3,000 dialogues with more than 10,000 annotated SQL queries in total, spanning 200 databases. The databases used by different data splits do not overlap, which tests how robustly a model handles databases it has not seen before. The dataset simulates database querying in real scenarios: a user may ask several rounds of questions, so the system must integrate information across turns.
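
The non-overlap property described above can be checked with a few lines of code. The following is a minimal sketch, assuming each example is a dictionary carrying a `database_id` field; that field name and the `shared_databases` helper are hypothetical and are not taken from the dataset's actual file format.

```python
def shared_databases(splits):
    """Return the database ids that appear in more than one split.

    `splits` maps a split name to a list of examples. Each example is
    assumed (hypothetically) to carry a "database_id" key.
    """
    first_seen = {}  # database id -> name of the first split that used it
    shared = set()
    for split_name, examples in splits.items():
        for db_id in {ex["database_id"] for ex in examples}:
            if db_id in first_seen and first_seen[db_id] != split_name:
                shared.add(db_id)
            first_seen.setdefault(db_id, split_name)
    return shared
```

An empty result means no database is shared between splits, which is the condition the description states.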

CoSQL consists of 3 tasks:

  • SQL-grounded dialogue state tracking: Given the interaction history, convert the user's current request into the corresponding SQL statement.
  • Natural language response generation: Generate a natural language response from the SQL statement and the results it returns.
  • User dialogue act prediction: For each user utterance, predict which DB user dialogue act tag it belongs to.
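
The three tasks above can be illustrated with a toy example. This is only a sketch under stated assumptions: the `student` table, the two-turn dialogue, the `"inform"` act label, and the helper functions are all invented for illustration and are not taken from CoSQL or its tag set. It shows how the SQL for the second turn depends on the first turn, and how a response could be produced from the returned rows.

```python
import sqlite3

# Toy in-memory database, invented for illustration (not from CoSQL).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    INSERT INTO student VALUES (1, 'Ann', 19), (2, 'Bob', 22), (3, 'Cy', 25);
    """
)

# Hypothetical two-turn dialogue. The "act" values are placeholder labels,
# not the dataset's actual dialogue act tags.
dialogue = [
    {"user": "Show me the names of all students.",
     "act": "inform",
     "sql": "SELECT name FROM student ORDER BY id"},
    # This turn is only interpretable given the previous one, so its SQL
    # carries over the earlier request and adds a filter.
    {"user": "Only the ones older than 20.",
     "act": "inform",
     "sql": "SELECT name FROM student WHERE age > 20 ORDER BY id"},
]

def execute_turn(turn):
    """Run the SQL that encodes the dialogue state after this turn."""
    return [row[0] for row in conn.execute(turn["sql"])]

def respond(names):
    """Template-based stand-in for natural language response generation."""
    if not names:
        return "No students match."
    return "Matching students: " + ", ".join(names) + "."

results = [execute_turn(turn) for turn in dialogue]
replies = [respond(names) for names in results]
```

In a real system the `sql` and `act` fields would be model predictions rather than hand-written values, and the template in `respond` would be replaced by a learned generator.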
CoSQL.torrent
  • CoSQL/
    • README.md (1.54 KB)
    • README.txt (3.09 KB)
    • data/
      • cosql_dataset.zip (100.44 MB)