
Chinese AI Startup MiniMax Unveils Open Source MiniMax-M1: 1 Million Token Context and Hyper-Efficient Reinforcement Learning


Chinese AI startup MiniMax, best known in the West for its realistic AI video model Hailuo, has unveiled its latest large language model (LLM), MiniMax-M1. The model is fully open source under the Apache 2.0 license, allowing businesses and developers to use and modify it without licensing fees. Released on June 16, 2025, MiniMax-M1 is available on Hugging Face and GitHub, and its debut marks the beginning of "MiniMaxWeek," a series of upcoming product announcements.

The core strength of MiniMax-M1 is its context window, which can handle up to 1 million input tokens and produce up to 80,000 output tokens. For perspective, OpenAI's GPT-4o manages a context window of 128,000 tokens, while Google's Gemini 2.5 Pro matches the 1 million input token capacity, with a 2 million token window reportedly in development. The context window determines the maximum amount of text an LLM can process at once. Tokens, the fundamental units of text, cover words, word parts, punctuation, and code symbols, and are mapped to numerical vectors for the model to interpret and manipulate (a short token-counting sketch appears after the deployment notes below). With this window, MiniMax-M1 can ingest the equivalent of a small collection of books in a single pass, making it well suited to long-context reasoning tasks.

Efficiency is the other highlight. MiniMax-M1 combines a hybrid Mixture-of-Experts (MoE) architecture with a lightning attention mechanism that significantly reduces inference cost: according to the technical report, the model needs only 25% of the floating-point operations (FLOPs) that DeepSeek R1 requires to generate a 100,000-token sequence. On the training side, MiniMax developed a custom reinforcement learning (RL) algorithm called CISPO, which clips importance sampling weights rather than token updates (sketched in code below) and works together with the hybrid attention design to make RL scaling unusually efficient.

The model ships in two variants, MiniMax-M1-40k and MiniMax-M1-80k, named for their output lengths or "thinking budgets." Both are built on the MiniMax-Text-01 foundation and hold 456 billion parameters, with 45.9 billion activated per token.

The training cost for M1 was reported at $534,700, a striking figure next to the roughly $5-6 million spent on DeepSeek's R1 and the more than $100 million OpenAI reportedly spent training GPT-4 two years earlier. MiniMax attributes the savings chiefly to the CISPO algorithm and the hybrid attention design, which sharply reduce the compute the RL phase consumes.

MiniMax-M1 has been rigorously tested against established benchmarks for advanced reasoning, software engineering, and tool use. On the AIME 2024 mathematics competition benchmark, the M1-80k variant achieved 86.0% accuracy, outperforming other open-weight models such as DeepSeek R1 and Qwen3-235B-A22B. Closed-weight models from OpenAI and Google still lead on some benchmarks, but MiniMax-M1 significantly narrows the gap while remaining openly accessible.

Deployment options for MiniMax-M1 are versatile. MiniMax recommends vLLM, a serving backend optimized for large-model workloads, for its memory efficiency and batched request handling; alternatively, the model can be deployed through the Transformers library, offering flexibility for different use cases.
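For teams starting with the vLLM path, the sketch below shows what a minimal offline-inference setup might look like. It is illustrative only: the repository id, GPU count, and sampling settings are assumptions, and the model card should be consulted for the launch configuration MiniMax actually recommends.

```python
# A minimal vLLM inference sketch; repo id, GPU count, and sampling
# settings are assumptions, not MiniMax's documented launch config.
from vllm import LLM, SamplingParams

llm = LLM(
    model="MiniMaxAI/MiniMax-M1-40k",  # assumed Hugging Face repo id
    tensor_parallel_size=8,            # shard the 456B-parameter MoE across 8 GPUs
    trust_remote_code=True,            # the release ships custom model code
)

params = SamplingParams(temperature=1.0, top_p=0.95, max_tokens=2048)
outputs = llm.generate(
    ["Summarize the attached contract in five bullet points."], params
)
print(outputs[0].outputs[0].text)
```

To connect the Transformers path with the token arithmetic discussed earlier, here is a minimal token-counting sketch. The repository id again mirrors the Hugging Face release, and `annual_report.txt` is a hypothetical stand-in for a long enterprise document.

```python
# A back-of-the-envelope token count using the model's own tokenizer.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "MiniMaxAI/MiniMax-M1-40k", trust_remote_code=True  # assumed repo id
)

with open("annual_report.txt") as f:  # hypothetical long input document
    text = f.read()

n_tokens = len(tokenizer.encode(text))
print(f"{n_tokens:,} tokens of the 1,000,000-token window used")
```

Finally, for readers curious about CISPO, the following PyTorch fragment illustrates the core idea as described above: the importance-sampling weight is clipped and detached, so every token retains a gradient path, unlike PPO-style clipping, which zeroes updates for out-of-range tokens. The clipping bounds and weighting here are illustrative; the exact objective is given in the MiniMax-M1 technical report.

```python
# An illustrative CISPO-style loss, written from the description above;
# consult the MiniMax-M1 technical report for the exact objective.
import torch

def cispo_loss(logp_new, logp_old, advantages, eps_low=1.0, eps_high=2.0):
    # Token-level importance-sampling ratio between current and behavior policy.
    ratio = torch.exp(logp_new - logp_old)
    # Clip the IS weight itself and stop its gradient (the CISPO idea),
    # rather than clipping the policy update as PPO does. With eps_low=1.0
    # the lower bound is 0, i.e. effectively no lower clip.
    weight = torch.clamp(ratio, 1.0 - eps_low, 1.0 + eps_high).detach()
    # REINFORCE-style term: every token keeps a nonzero gradient path.
    return -(weight * advantages * logp_new).mean()
```

Because the clipped weight acts as a constant coefficient, rare but pivotal tokens in a long chain of thought are never silently dropped from the gradient, which is the property the report credits for stable long-horizon RL training.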
MiniMax-M1 also includes structured function calling capabilities and a chatbot API equipped with online search, video and image generation, speech synthesis, and voice cloning tools. These features strengthen its agentic behavior and make it suitable for real-world applications (a hedged request sketch appears at the end of this article).

For technical decision-makers and enterprise buyers, MiniMax-M1 addresses several key challenges. Engineering leads benefit from its lower operational cost profile and its ability to reason over extensive enterprise documents or log data, potentially reducing preprocessing effort. Teams managing AI orchestration pipelines can integrate the model easily thanks to its compatibility with established tools like vLLM and Transformers, simplifying scaling strategies and strengthening internal copilot or agent-based systems. Data platform teams will appreciate the support for structured function calling and the open-source license, which allows customization and avoids vendor lock-in. Security leads may also consider M1 for secure, on-premises deployment of a high-capability model, keeping sensitive data inside company networks.

In summary, MiniMax-M1 represents a major advance in the AI landscape, combining open access with state-of-the-art architecture and computational efficiency. Its flexibility and cost-effectiveness make it a compelling choice for organizations looking to implement or enhance their AI capabilities, especially in scenarios that demand deep reasoning over long-range context.

Industry insiders view MiniMax-M1 as a potential game-changer. Its open-source release democratizes access to advanced AI, fostering innovation and lowering barriers to entry for smaller businesses and developers. MiniMax's focus on practical, scalable solutions aligns with the growing demand for models that are easy to integrate and fine-tune. With a track record of impactful AI technology, MiniMax continues to position itself as a leader in the industry, and MiniMaxWeek promises further developments in this space. Stay tuned for more updates from the company.
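To close on a practical note, here is a sketch of the structured function calling described above. The request targets an OpenAI-compatible chat-completions endpoint, such as the one vLLM's built-in server exposes; the URL, model id, and weather tool are illustrative assumptions rather than MiniMax's documented API.

```python
# A hedged function-calling sketch against an OpenAI-compatible endpoint;
# URL, model id, and the get_weather tool are all illustrative.
import requests

payload = {
    "model": "MiniMaxAI/MiniMax-M1-40k",  # assumed model id
    "messages": [{"role": "user", "content": "What's the weather in Shanghai?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool
            "description": "Look up current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
}

resp = requests.post("http://localhost:8000/v1/chat/completions", json=payload)
# If the model decides to call the tool, the structured call appears here.
print(resp.json()["choices"][0]["message"].get("tool_calls"))
```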
