CulturalGround Multilingual Cultural Visual Question Answering Dataset
License
Apache 2.0
CulturalGround is a multilingual, multimodal visual question answering dataset for cultural knowledge alignment, released in 2025 by NeuLab at Carnegie Mellon University. It accompanies the paper "Grounding Multilingual Multimodal LLMs With Cultural Knowledge" and aims to improve how well multimodal large language models understand and reason about niche cultural entities and low-resource languages.
The dataset contains 22 million high-quality, culturally rich question-answer pairs covering 42 countries and 39 languages. Each sample consists of an image, a question, and an answer, and samples are organized by country and language so that model predictions can be aligned directly with cultural entities.
The data includes:
- Image and entity metadata (country/language/entity ID/cultural attributes)
- Visual Q&A samples: Open-ended questions plus multiple-choice and true/false questions, available in both unfiltered and filtered versions
- Multilingual text: Questions and answers in 39 languages, supporting cross-language training and evaluation
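
As an illustration of the structure described above, the following is a minimal Python sketch of how one sample might be represented and grouped by country and language. The class name, field names, question-type labels, and example values are all hypothetical and chosen for illustration only; they are not the dataset's actual schema, which should be checked against the released files.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class VQASample:
    """Hypothetical record layout; field names are assumptions, not the real schema."""
    image_path: str
    question: str
    answer: str
    country: str        # country the cultural entity belongs to
    language: str       # language of the question and answer
    entity_id: str      # placeholder identifier for the cultural entity
    question_type: str  # assumed labels: "open_ended", "multiple_choice", "true_false"


def group_by_country_language(samples):
    """Group samples into buckets keyed by (country, language)."""
    groups = defaultdict(list)
    for sample in samples:
        groups[(sample.country, sample.language)].append(sample)
    return dict(groups)


# Made-up example records, used only to exercise the grouping function.
samples = [
    VQASample("img/0001.jpg", "What is this building?", "A temple",
              "Japan", "en", "E001", "open_ended"),
    VQASample("img/0001.jpg", "Is this a temple?", "True",
              "Japan", "en", "E001", "true_false"),
    VQASample("img/0002.jpg", "Qu'est-ce que ce plat ?", "Un tajine",
              "Morocco", "fr", "E002", "open_ended"),
]

groups = group_by_country_language(samples)
for (country, language), bucket in sorted(groups.items()):
    print(country, language, len(bucket))
```

Keying the groups on a (country, language) pair mirrors the organization described above and makes it straightforward to build per-language training or evaluation splits.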
