Mistral AI released Mistral Large 2, its largest dense model to date at 123 billion parameters. The model fits on a single H100 node, and the weights are open for non-commercial use. The release follows Meta's Llama 3.1 405B.
• 123 billion parameters on a single H100 node
• 128k context window, supporting dozens of languages
• Achieves 84% on MMLU, 8.63 on MT Bench, and 92% on HumanEval
• Available on Hugging Face for research and non-commercial use (see the loading sketch after this list)
• Commercial license available for deployment
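As a minimal sketch of the Hugging Face route, the snippet below loads the released weights with transformers. The repo id mistralai/Mistral-Large-Instruct-2407 is an assumption based on the release naming; check the model card for the gated-access and license terms before use.

```python
# Minimal sketch, assuming the repo id below: loading Mistral Large 2 with transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-Large-Instruct-2407"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
# 123B parameters will not fit on one GPU; device_map="auto" shards the
# weights across all visible devices (e.g., the GPUs of an H100 node).
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [{"role": "user", "content": "Summarize yourself in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```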
Enhanced Coding and Reasoning Capabilities
Mistral Large 2 excels at coding, having been trained on more than 80 programming languages. On coding benchmarks it matches or surpasses models such as GPT-4o, Claude 3 Opus, and Llama 3.1 405B.
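For a hedged sketch of trying those coding abilities over Mistral's hosted chat completions API, the model name "mistral-large-2407" below is an assumption for Mistral Large 2; verify both it and the endpoint against the current API documentation.

```python
# Hedged sketch: a coding request against Mistral's hosted chat completions API.
import os
import requests

resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-large-2407",  # assumed API name for Mistral Large 2
        "messages": [{
            "role": "user",
            "content": "Write a Python function that merges two sorted lists.",
        }],
        "temperature": 0.2,  # low temperature tends to suit code generation
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```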
Compared with its predecessor, the original Mistral Large, it hallucinates less and is more reliable, making it more dependable for complex tasks.
Multilingual Training and Performance
Trained on a significant amount of multilingual data, Mistral Large 2 excels in languages like English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, and Hindi.
Custom Models with NVIDIA AI Foundry
Separately, NVIDIA's AI Foundry lets you create custom "supermodels" tailored to your needs, trained with proprietary data as well as synthetic data generated from Llama 3.1 405B.
It handles data curation, synthetic data generation, fine-tuning with proprietary data, retrieval for accurate responses, comprehensive evaluation, and deployment.
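As an illustrative sketch of the synthetic-data step only, the snippet below prompts Llama 3.1 405B through an OpenAI-compatible chat endpoint to produce fine-tuning records. The base URL integrate.api.nvidia.com/v1 and the model id meta/llama-3.1-405b-instruct are assumptions, not confirmed by the source, and the rest of the Foundry pipeline (curation, evaluation, deployment) is not shown.

```python
# Illustrative sketch of the synthetic-data step: prompting Llama 3.1 405B
# to generate supervised fine-tuning records for a custom model.
import json
import os
import requests

def generate_synthetic_pairs(topic: str, n: int = 5) -> list[dict]:
    """Ask the teacher model for n question/answer pairs about `topic`."""
    resp = requests.post(
        "https://integrate.api.nvidia.com/v1/chat/completions",  # assumed endpoint
        headers={"Authorization": f"Bearer {os.environ['NVIDIA_API_KEY']}"},
        json={
            "model": "meta/llama-3.1-405b-instruct",  # assumed model id
            "messages": [{
                "role": "user",
                "content": (
                    f"Generate {n} question/answer pairs about {topic} "
                    "as a JSON array of objects with 'question' and 'answer' "
                    "keys. Return only the JSON."
                ),
            }],
            "temperature": 0.7,
        },
        timeout=120,
    )
    resp.raise_for_status()
    # A sketch: real pipelines would validate the JSON and filter low-quality pairs.
    return json.loads(resp.json()["choices"][0]["message"]["content"])

# Each pair becomes one supervised fine-tuning record for the custom model.
for pair in generate_synthetic_pairs("invoice processing"):
    print(json.dumps({"prompt": pair["question"], "completion": pair["answer"]}))
```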