Post 332
When @MistralAI drops a blog post labelled "Large Enough," it's going to get serious!
- Mistral-Large-Instruct-2407, or just Mistral-Large 2, is a 123B-parameter Instruct model with a 128k context window
- Multilingual in 11 languages: English, French, German, Spanish, Italian, Chinese, Japanese, Korean, Portuguese, Dutch, and Polish
- Also highly focused on programming, trained on 80+ coding languages such as Python, Java, C, C++, JavaScript, and Bash
- Supports native function calling and structured output (see the sketch after this list)
- Released under the Mistral Research License (non-commercial, research use only)
- Open weights only; no data or code released
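Since native function calling is the headline feature for tooling folks, here's a minimal sketch of how it could look through the Hugging Face chat template. This is an assumption-heavy illustration: the get_weather tool is made up, and it presumes a recent transformers release (with the tools kwarg on apply_chat_template) plus access to the gated repo. It's not Mistral's documented API.

# Hypothetical sketch: passing a tool schema through the tokenizer's chat template.
# Assumes a recent transformers version whose chat templates accept `tools`.
from transformers import AutoTokenizer

# Gated repo: requires accepting the Mistral Research License on the Hub first.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-Large-Instruct-2407")

# Illustrative tool definition (JSON-schema style); not taken from the model card.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "City name"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Paris right now?"}]

# The chat template embeds the tool schema so the model can reply with a structured call.
prompt = tokenizer.apply_chat_template(
    messages, tools=tools, add_generation_prompt=True, tokenize=False
)
print(prompt)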
Definitely firing shots at @Meta's Llama 3.1:
MMLU - 84.0% (ML2) vs 79.3% (L3.1-70B) vs 85.2% (L3.1-405B)
GSM8K - 93% (ML2) vs 95.5% (L3.1-70B-Ins) vs 96.8% (L3.1-405B-Ins)
Also, it's kinda chunky!
fp16/bf16: ~250 GB VRAM
fp8/int8: ~125 GB VRAM
int4: ~60 GB VRAM
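Those footprints are just napkin math: parameter count times bytes per parameter, for the weights alone (KV cache, activations, and runtime overhead come on top):

# Weights-only memory estimate for a 123B-parameter model.
PARAMS = 123e9

def weight_gb(bytes_per_param: float) -> float:
    return PARAMS * bytes_per_param / 1e9  # decimal gigabytes

for fmt, bpp in [("fp16/bf16", 2.0), ("fp8/int8", 1.0), ("int4", 0.5)]:
    print(f"{fmt}: ~{weight_gb(bpp):.0f} GB")
# -> fp16/bf16: ~246 GB, fp8/int8: ~123 GB, int4: ~62 GB (before overhead)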
I tried quantising it to AWQ and GPTQ, but couldn't with 30 GB of VRAM.
Also calling out AWQ and GPTQ for not supporting multi-GPU quantisation!
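For context, the recipe I was attempting follows AutoAWQ's standard example, sketched below; the output path and quant config values are illustrative, not anything official for this model.

# Rough sketch of the AutoAWQ flow (pip install autoawq); paths/config are illustrative.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "mistralai/Mistral-Large-Instruct-2407"
quant_path = "mistral-large-instruct-2407-awq"       # illustrative output dir
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

model.quantize(tokenizer, quant_config=quant_config)  # this is where the VRAM runs out
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)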
Godsend @casperhansen has posted an AWQ-quantised INT4 model (68.68 GB) with a perplexity of 2.889: casperhansen/mistral-large-instruct-2407-awq
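If you have roughly 70+ GB of GPU memory across your cards, running that checkpoint should be straightforward. A minimal sketch, assuming autoawq is installed and device_map="auto" is left to shard the weights:

# Minimal sketch: run the community AWQ INT4 checkpoint (assumes autoawq is installed).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "casperhansen/mistral-large-instruct-2407-awq"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # shard the ~69 GB of weights across available GPUs
)

messages = [{"role": "user", "content": "Write a Python one-liner to reverse a string."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))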
Looks like open AI is going to beat OpenAI!
Blog post: https://mistral.ai/news/mistral-large-2407/
Models: mistralai/Mistral-Large-Instruct-2407