
Highlights

  • Mistral Large 2: the second release of a GPT-4 class open weights model in two days, after yesterday’s Llama 3.1 405B.
  • The weights for this one are under Mistral’s Research License, which “allows usage and modification for research and non-commercial usages” - so not as open as Llama 3.1. You can use it commercially via the Mistral paid API (a minimal request sketch follows these highlights).
  • Mistral Large 2 is 123 billion parameters, “designed for single-node inference” (on a very expensive single node!) and has a 128,000 token context window, the same size as Llama 3.1.
  • Notably, according to Mistral’s own benchmarks it out-performs the much larger Llama 3.1 405B on their code and math benchmarks. They trained on a lot of code:
  • Following our experience with Codestral 22B and Codestral Mamba, we trained Mistral Large 2 on a very large proportion of code. Mistral Large 2 vastly outperforms the previous Mistral Large, and performs on par with leading models such as GPT-4o, Claude 3 Opus, and Llama 3 405B.
  • One of the key focus areas during training was to minimize the model’s tendency to “hallucinate” or generate plausible-sounding but factually incorrect or irrelevant information. This was achieved by fine-tuning the model to be more cautious and discerning in its responses, ensuring that it provides reliable and accurate outputs.
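
As a rough illustration of the paid API route mentioned above, here is a minimal sketch of a chat completion request against Mistral’s hosted API. The `mistral-large-2407` model ID, the `MISTRAL_API_KEY` environment variable, and the exact response shape are assumptions to check against Mistral’s current documentation rather than confirmed details from this post.

```python
# Minimal sketch: call Mistral Large 2 via Mistral's paid API.
# Assumptions: model ID "mistral-large-2407" and the MISTRAL_API_KEY
# environment variable are illustrative; verify both against Mistral's docs.
import os
import requests

response = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-large-2407",
        "messages": [
            {"role": "user", "content": "Write a Python function that merges two sorted lists."}
        ],
    },
    timeout=60,
)
response.raise_for_status()
# The response is assumed to follow the familiar chat-completions shape:
# generated text nested under choices -> message -> content.
print(response.json()["choices"][0]["message"]["content"])
```

The request shape follows the common chat-completions pattern: a model ID plus a list of role/content messages, with the generated text read back out of the first choice in the JSON response.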