ODSQA: Open-Domain Spoken Question Answering Dataset
ODSQA (Open-Domain Spoken Question Answering Dataset) comes from the paper "ODSQA: Open-domain Spoken Question Answering Dataset". It is a Chinese dataset. An English dataset, Spoken-SQuAD, is also provided.
Spoken-SQuAD is a spoken question answering (SQA) corpus generated from the SQuAD dataset with Google's text-to-speech (TTS) system. Although Spoken-SQuAD is large enough to train state-of-the-art question answering models, it is synthetically generated, so a gap remains between it and real spoken question answering. The researchers therefore released ODSQA, an SQA dataset containing more than three thousand questions. It is currently the largest real SQA dataset for extraction-based question answering tasks.