xAI Launches New Model Pricing and Tool Fees
xAI offers a range of language and image generation models, with pricing that depends on token usage, model capabilities, and additional tool integrations. Rates vary by model type, supported modalities, context window size, and usage patterns.

Language models are priced per million tokens. Grok-code-fast-1 has a 256,000-token context window and a rate limit of 480 requests per minute. Grok-4-fast-reasoning and Grok-4-fast-non-reasoning both have a 2 million token context window and the same 480 requests-per-minute limit, as does Grok-4-0709 with its 256,000-token context window. Grok-3-mini and Grok-3 both have a 131,072-token context window, with rate limits of 480 and 600 requests per minute respectively. The Grok-2-vision models accept both text and image inputs, have a 32,768-token context window, and are available in two regions: us-east-1 and eu-west-1.

Image generation models are priced per image output; Grok-2-image-1212 costs $0.07 per generated image.

Tool usage is billed separately, based on token consumption and tool invocations. Available tools include Web Search, X Search, Code Execution, View Image, View X Video, Collections Search, and Remote MCP Tools. Web Search, X Search, and Code Execution cost $10 per 1,000 calls, and Collections Search costs $2.50 per 1,000 requests. Remote MCP Tools are billed on tokens used rather than invocation count. View Image and View X Video likewise carry no per-invocation charge; they are billed on the number of tokens processed from the image or video content.

Live Search costs $25 per 1,000 sources requested. Each source type (Web, X, News, RSS) is counted individually, regardless of how many citations are returned, and the number of sources consumed is reported in the response under response.usage.num_sources_used. Documents Search via the Collections API is priced at $2.50 per 1,000 requests, while File and Collections Storage are currently free.
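As a rough illustration of how these per-call and per-source rates add up, the sketch below estimates costs from a response's reported usage. The rates are the ones quoted above, and `num_sources_used` mirrors the field name in the response; this is an illustrative calculator, not an official SDK, and current prices should always be confirmed in the xAI Console.

```python
# Illustrative cost estimator based on the rates quoted above.
# These constants are assumptions copied from the article, not an API contract.
LIVE_SEARCH_PER_SOURCE = 25.0 / 1000    # Live Search: $25 per 1,000 sources
TOOL_CALL_RATE = 10.0 / 1000            # Web Search / X Search / Code Execution: $10 per 1,000 calls
COLLECTIONS_SEARCH_RATE = 2.50 / 1000   # Collections Search: $2.50 per 1,000 requests


def live_search_cost(num_sources_used: int) -> float:
    """Cost of one Live Search response, given response.usage.num_sources_used."""
    return num_sources_used * LIVE_SEARCH_PER_SOURCE


def tool_calls_cost(num_calls: int) -> float:
    """Cost of a batch of Web Search / X Search / Code Execution invocations."""
    return num_calls * TOOL_CALL_RATE


# A response that drew on 8 sources costs 8 * $0.025.
print(live_search_cost(8))    # → 0.2
print(tool_calls_cost(1000))  # → 10.0
```

Because every source type (Web, X, News, RSS) is counted individually, the value passed to `live_search_cost` should be the total across types, which is exactly what `num_sources_used` reports.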
A violation fee of $0.05 per request applies if a request is flagged by the system for violating usage guidelines, though this is rare for most users.

Model aliases such as -latest let users automatically access the newest version of a model with updated features, and most users are encouraged to use these aliases to benefit from ongoing improvements. Model access and availability may vary by region, account type, and other factors.

Billing is managed through the xAI Console, where users can view their usage, including cached prompt tokens, in the usage object. Caching is enabled by default and reduces costs for repeated prompts by reusing stored prompt data. Token counting follows standard practices, and the total prompt length, including conversation history, must not exceed the model's context window. For full details on model capabilities, input/output types, and pricing, refer to the official xAI Console or API documentation.
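Since the total prompt, including conversation history, must fit within the model's context window, it can be useful to check a conversation's size before sending it. The sketch below does this with the 131,072-token window quoted above for Grok-3; the 4-characters-per-token heuristic is a rough assumption of mine, not xAI's actual tokenizer, so real budgeting should use the token counts the API reports in the usage object.

```python
# Sketch: guard a conversation against a model's context window before sending.
# GROK_3_CONTEXT matches the 131,072-token window quoted in the article;
# the chars-per-token ratio is a crude assumption, not xAI's tokenizer.
GROK_3_CONTEXT = 131_072


def rough_token_count(text: str) -> int:
    """Crude estimate: roughly 4 characters per token."""
    return max(1, len(text) // 4)


def fits_context(messages: list[dict], limit: int = GROK_3_CONTEXT,
                 reserve_for_output: int = 1024) -> bool:
    """True if the full conversation history, plus room for the reply, fits."""
    used = sum(rough_token_count(m["content"]) for m in messages)
    return used + reserve_for_output <= limit


history = [{"role": "user", "content": "Hello" * 100}]
print(fits_context(history))  # → True for a short conversation
```

Reserving a slice of the window for the model's output is a common precaution, since the context limit covers input and output together on many APIs.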
