singhsidhukuldeep posted an update 1 day ago
When @MistralAI drops a blog post labelled "Large Enough," you know things are about to get serious! 🚀💡

- Mistral-Large-Instruct-2407 (just call it Mistral Large 2) is a 123B-parameter Instruct model with a 128k context window 🌐📚

- Multilingual in 11 languages: English 🇬🇧, French 🇫🇷, German 🇩🇪, Spanish 🇪🇸, Italian 🇮🇹, Chinese 🇨🇳, Japanese 🇯🇵, Korean 🇰🇷, Portuguese 🇵🇹, Dutch 🇳🇱, and Polish 🇵🇱. 🗣️🗺️

- Also heavily focused on coding, trained on 80+ programming languages such as Python, Java, C, C++, JavaScript, and Bash 💻🔧

- Supports native function calling and structured output (see the sketch after this list). 🛠️📊

- Released under the Mistral Research License (non-commercial, research use only 😔)

- Open weights only 🔓, no data or code released 🔒📝
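
For a feel of the function-calling support, here's a minimal sketch assuming you serve the model behind an OpenAI-compatible endpoint (e.g., via vLLM); the base_url and the get_weather tool are illustrative assumptions, not part of the release:

```python
# Hypothetical sketch: function calling against Mistral-Large-Instruct-2407
# served behind an OpenAI-compatible API (e.g., vLLM). The endpoint URL and
# the get_weather tool are made-up examples.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # illustrative tool, not from the release
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="mistralai/Mistral-Large-Instruct-2407",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)

# If the model decides to call the tool, the structured call lands here:
print(response.choices[0].message.tool_calls)
```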

Definitely firing shots at @Meta Llama3.1: 🎯🔥
MMLU - 84.0% (ML2) vs 79.3% (L3.1-70B) vs 85.2% (L3.1-405B)
GSM8K - 93% (ML2) vs 95.5% (L3.1-70B-Ins) vs 96.8% (L3.1-405B-Ins)

Also, it's kinda chunky! 📦💪
fp16/bf16 - ~250GB VRAM
fp8/int8 - ~125GB VRAM
int4 - ~60GB VRAM
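
Those numbers are roughly parameter count × bytes per weight, plus some overhead; a quick back-of-the-envelope check:

```python
# Back-of-the-envelope VRAM needed just for the weights of a ~123B model
# (ignores KV cache, activations, and framework overhead).
PARAMS = 123e9

for precision, bytes_per_param in [("fp16/bf16", 2), ("fp8/int8", 1), ("int4", 0.5)]:
    print(f"{precision}: ~{PARAMS * bytes_per_param / 1e9:.0f} GB")

# fp16/bf16: ~246 GB, fp8/int8: ~123 GB, int4: ~62 GB -- in line with the figures above
```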

I tried quantising it to AWQ and GPTQ, but couldn't fit the job in 30GB of VRAM. ❌🖥️

Also calling out AWQ and GPTQ for not supporting multi-GPU quantisation! 🖥️⚡
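
For reference, this is roughly what the attempt looks like with AutoAWQ (a sketch using the library's usual recipe; the quant_config values are the common defaults, and the output path is arbitrary — calibrating a 123B model needs far more than 30GB of VRAM):

```python
# Rough sketch of AWQ quantisation with AutoAWQ.
# Assumes the standard AutoAWQ workflow; the config values are the usual defaults.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "mistralai/Mistral-Large-Instruct-2407"
quant_path = "mistral-large-2407-awq-int4"  # arbitrary output directory

quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

model.quantize(tokenizer, quant_config=quant_config)  # calibration happens here
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```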

Godsend @casperhansen has posted an AWQ-quantised INT4 model (68.68 GB) with a perplexity of 2.889: casperhansen/mistral-large-instruct-2407-awq 🔥👏
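
If you want to try that checkpoint, here's a minimal serving sketch with vLLM (tensor_parallel_size=4 is just an illustrative choice; size it to your GPUs, since the weights alone are ~69 GB):

```python
# Minimal sketch: running the community AWQ checkpoint with vLLM.
# tensor_parallel_size=4 is an assumption for this example, not a recommendation.
from vllm import LLM, SamplingParams

llm = LLM(
    model="casperhansen/mistral-large-instruct-2407-awq",
    quantization="awq",
    tensor_parallel_size=4,
)

outputs = llm.generate(["Explain AWQ quantisation in one sentence."],
                       SamplingParams(max_tokens=64))
print(outputs[0].outputs[0].text)
```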

Looks like open AI is going to beat OpenAI! 🏆🤖

Blog post: https://mistral.ai/news/mistral-large-2407/

Models: mistralai/Mistral-Large-Instruct-2407