Nemotron-Personas-USA Persona Dataset
Nemotron-Personas-USA is a large-scale synthetic user profile dataset released by NVIDIA in 2025, designed to support the training and evaluation of large language models (LLMs) and intelligent agent systems in tasks such as dialogue generation, role simulation, user modeling, and diverse behavior analysis.
The dataset contains approximately 1 million synthetic persona records. Together they hold about 6 million persona fields (six per record), and each record also carries 16 contextual information fields. The data covers all 50 U.S. states as well as Puerto Rico and the Virgin Islands, spanning 29,000 ZIP Code Tabulation Areas (ZCTAs) and 15,200 cities and regions, which gives broad coverage of the geographic and social distribution of the U.S. population.
The dataset contains approximately 970,000 unique names and covers more than 560 occupational categories. The occupational distribution is modeled on real-world occupational statistics, so the mix of occupations is intended to be socially representative. Each record combines two kinds of fields: structured demographic information (age, gender, education level, income, occupation, and location) and natural language persona descriptions (interests, values, lifestyle, and personal goals). Together they form a composite persona representation that pairs structured attributes with unstructured text.
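
The composite representation described above can be pictured as one record holding both structured attributes and free-text descriptions. The sketch below is illustrative only: the class `PersonaRecord`, its field names, and the sample values are hypothetical and are not the dataset's actual schema. It covers a subset of the attributes mentioned above and shows one way such a record might be flattened into a role-simulation prompt.

```python
from dataclasses import dataclass, field


@dataclass
class PersonaRecord:
    """Hypothetical composite persona: structured fields plus free text."""

    # Structured demographic attributes (names are illustrative assumptions).
    age: int
    gender: str
    education_level: str
    occupation: str
    state: str
    # Natural language persona descriptions, keyed by theme.
    descriptions: dict[str, str] = field(default_factory=dict)

    def to_prompt(self) -> str:
        """Combine the structured and free-text parts into a single prompt."""
        header = (
            f"You are a {self.age}-year-old {self.gender} {self.occupation} "
            f"from {self.state} ({self.education_level})."
        )
        details = " ".join(self.descriptions.values())
        return f"{header} {details}".strip()


# Invented sample values, used only to exercise the sketch.
example = PersonaRecord(
    age=34,
    gender="female",
    education_level="bachelor's degree",
    occupation="registered nurse",
    state="Ohio",
    descriptions={
        "interests": "Enjoys trail running and baking.",
        "goals": "Wants to move into nursing education.",
    },
)

print(example.to_prompt())
```

Keeping the free-text descriptions in a dictionary keyed by theme lets a downstream task pick only the parts it needs, for example interests alone for a recommendation dialogue.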
