TransGPT-pt&sft Traffic Dialogue Pre-training Dataset
This dataset is part of TransGPT, the first comprehensive transportation model in China, released by Beijing Jiaotong University in 2024. It contains about 346,000 text records from the transportation domain, used for domain pre-training, and about 58,000 transportation dialogue records, used for fine-tuning. The model is described in the paper "TransGPT: Multi-modal Generative Pre-trained Transformer for Transportation".
Data sources include both single-modal and multi-modal data, such as a traffic sign encyclopedia, a driving test question bank, and global tourist attractions. The dataset covers multiple transportation-related fields, including road engineering, bridge engineering, tunnel engineering, highway transportation, water transportation, urban public transportation, transportation economics, and transportation safety, and provides general knowledge across these fields.
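The two subsets described above are typically consumed differently during training: plain text for pre-training, and prompt/answer pairs for fine-tuning. The following is a minimal sketch of that distinction only. The record classes, field names (`text`, `instruction`, `output`), and the prompt template are hypothetical and are not taken from the dataset's actual schema, which may differ.

```python
from dataclasses import dataclass


# Hypothetical record layouts; the dataset's real field names may differ.
@dataclass
class PretrainRecord:
    text: str  # a raw transportation-domain passage


@dataclass
class DialogueRecord:
    instruction: str  # the user's question
    output: str       # the reference answer


def to_pretrain_sample(rec: PretrainRecord) -> str:
    """Pre-training consumes the passage as plain text."""
    return rec.text.strip()


def to_sft_sample(rec: DialogueRecord) -> dict:
    """Fine-tuning pairs a prompt with the answer the model should produce."""
    # Illustrative prompt template, not the one used by TransGPT.
    prompt = f"Question: {rec.instruction.strip()}\nAnswer:"
    return {"prompt": prompt, "completion": " " + rec.output.strip()}


if __name__ == "__main__":
    passage = PretrainRecord(text="  A tunnel requires ventilation design.  ")
    dialogue = DialogueRecord(
        instruction="What does a red octagonal sign mean?",
        output="Stop.",
    )
    print(to_pretrain_sample(passage))
    print(to_sft_sample(dialogue))
```

Keeping the two converters separate reflects the split in the dataset itself: the larger text subset needs no prompt structure, while each dialogue record has to be mapped to whatever template the fine-tuning framework expects.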