Mistral AI Launches 7B-Parameter Open Model, Challenging Proprietary AI Dominance
Generative AI, particularly large language models (LLMs), is transforming content creation, knowledge retrieval, and problem-solving by generating human-like text, commands, and content from human instructions. In the coming years, generative AI is expected to redefine culture and daily life, altering the way we interact with machines and with each other.
At present, most advances in generative AI have been driven by proprietary solutions. However, much as open-source projects such as WebKit, Linux, and Kubernetes outperformed their proprietary counterparts, Mistral AI believes that open-source models will soon dominate the generative AI landscape. The company sees community-driven development as the key to addressing censorship and bias, which it considers essential for a technology that is shaping our future.
Mistral AI’s mission is to lead the revolution in open-source generative AI models. By training and releasing its own models, the company aims to create a credible alternative to the dominant AI oligopoly. Open-source models offer significant advantages over proprietary ones, including full control over the application engine, the ability to tailor models to specific tasks, and control over cost and latency. Enterprises can deploy these models on their own infrastructure, which protects data privacy and reduces dependence on black-box solutions that carry intellectual property (IP) leakage risks and offer limited customization.
Mistral AI is also committed to using open-source models as safeguards against the misuse of generative AI. Public and private organizations can audit these models for flaws and use them to detect misuse or misinformation. This transparency and control will become more important as the volume of generated content grows.
To kick-start its mission, Mistral AI has recently released "Mistral 7B," a 7-billion-parameter model that outperforms all currently available open-source models with up to 13 billion parameters on standard English and code benchmarks.
This achievement came after three months of intense work, during which the team rebuilt a top-performance MLOps stack and designed a sophisticated data processing pipeline from scratch. Mistral 7B's performance highlights what smaller models can achieve when they are built with conviction. Over the past two years, the smallest model scoring above 60% on the Massive Multitask Language Understanding (MMLU) benchmark has shrunk from Gopher (280 billion parameters, DeepMind, 2021) to Chinchilla (70 billion parameters, DeepMind, 2022) to Llama 2 (34 billion parameters, Meta, July 2023) and now to Mistral 7B. The model can handle a variety of tasks, including summarization, structuring, and question answering, and it processes and generates text faster than large proprietary solutions at a fraction of the cost.
Moving forward, Mistral AI plans to engage actively with the user community through a GitHub repository and a Discord channel. These platforms are intended to foster collaboration, support, and transparent communication.
Mistral AI is committed to releasing the strongest open-source models in parallel with developing its commercial offerings. The company will offer optimized proprietary models for on-premise and virtual private cloud deployment, distributed as white-box solutions that provide both model weights and source code. Hosted solutions and dedicated enterprise deployments are also in the works.
Mistral AI is currently training larger models and exploring novel architectures. The company anticipates further releases this fall, continuing its push to make open-source generative AI the preferred choice for a wide range of applications.
Industry insiders view Mistral AI's approach as a significant step towards democratizing AI technology. In their view, its commitment to open-source development promotes transparency, reduces bias, and accelerates innovation through community collaboration. Mistral AI, founded by a team with extensive experience in LLM development, is well positioned to lead this movement and challenge the established players in the AI market.