
Kuldeep Singh Sidhu

singhsidhukuldeep

When @MistralAI drops a blog post titled "Large Enough," you know it's going to get serious!

- Mistral-Large-Instruct-2407 (just call it Mistral Large 2) is a 123B-parameter instruct model with a 128k context window

- Multilingual across 11 languages: English, French, German, Spanish, Italian, Chinese, Japanese, Korean, Portuguese, Dutch, and Polish

- Strong focus on code: trained on 80+ programming languages, including Python, Java, C, C++, JavaScript, and Bash

- Supports native function calling and structured output

- Released under the Mistral Research License (non-commercial, research use only)

- Open weights only; no training data or code released
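"Native function calling" here means the model can emit a structured tool call instead of free text. A minimal sketch of the idea, using the common JSON tool-schema shape (the `get_weather` tool and the exact wire format are illustrative assumptions, not Mistral's documented API):

```python
import json

# Illustrative tool definition in the widely used JSON-schema style.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Paris?"}]

# A function-calling model, given `tools` and `messages`, would respond with
# a structured call like this (hardcoded here for illustration), which the
# application then parses and dispatches instead of showing raw text.
raw_response = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
call = json.loads(raw_response)
print(call["name"], call["arguments"])
```

Structured output works the same way in reverse: you constrain the model to emit JSON matching a schema, then validate it with an ordinary parser.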

Definitely firing shots at @Meta's Llama 3.1:
MMLU - 84.0% (ML2) vs 79.3% (L3.1-70B) vs 85.2% (L3.1-405B)
GSM8K - 93.0% (ML2) vs 95.5% (L3.1-70B-Instruct) vs 96.8% (L3.1-405B-Instruct)

Also, it's kinda chunky:
fp16/bf16 - ~250 GB VRAM
fp8/int8 - ~125 GB VRAM
int4 - ~60 GB VRAM
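These figures follow from simple arithmetic: weight memory is roughly parameter count times bytes per parameter. A quick sketch (weights only; activations, KV cache, and framework overhead come on top):

```python
def est_vram_gb(params_billion: float, bytes_per_param: float) -> float:
    """Weight memory in GB: (params_billion * 1e9) * bytes_per_param / 1e9."""
    return params_billion * bytes_per_param

# Mistral Large 2 has 123B parameters.
for dtype, nbytes in [("fp16/bf16", 2.0), ("fp8/int8", 1.0), ("int4", 0.5)]:
    print(f"{dtype}: ~{est_vram_gb(123, nbytes):.1f} GB")  # 246.0, 123.0, 61.5
```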

I tried quantising it to AWQ and GPTQ, but couldn't manage it with only 30 GB of VRAM.

Also calling out AWQ and GPTQ for not supporting multi-GPU quantisation!

Godsend @casperhansen has posted an AWQ-quantised INT4 model (68.68 GB) with a perplexity of 2.889: casperhansen/mistral-large-instruct-2407-awq
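For intuition on what INT4 weight quantisation does, here's a toy round-to-nearest sketch in plain Python. Note this is NOT AWQ's actual algorithm: AWQ additionally rescales salient weight channels based on activation statistics before rounding, which is what keeps perplexity low.

```python
def quantize_int4(weights):
    """Symmetric round-to-nearest INT4: map floats to integers in [-8, 7]."""
    scale = max(abs(w) for w in weights) / 7.0  # one shared scale per group
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the INT4 codes."""
    return [v * scale for v in q]

w = [0.12, -0.5, 0.33, 0.07]
q, scale = quantize_int4(w)
w_hat = dequantize(q, scale)
# Each reconstructed weight lands within scale/2 of the original.
print(q, max(abs(a - b) for a, b in zip(w, w_hat)))
```

Real quantisers apply this per group (e.g. one fp16 scale per 128 weights), which is how 4-bit codes plus scales end up near the ~0.56 bytes/parameter implied by the 68.68 GB checkpoint above.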

Looks like open AI is going to beat OpenAI!

Blog post: https://mistral.ai/news/mistral-large-2407/

Models: mistralai/Mistral-Large-Instruct-2407
Yet another post hailing how good Meta's Llama 3.1 is? Guess not!

While Llama 3.1 is truly impressive, especially the 405B (which gives GPT-4o a run for its money!),

I was surprised to see that on the Open LLM Leaderboard, Llama 3.1 70B was not able to dethrone the current king, Qwen2-72B!

Not only that: on a few benchmarks, like MATH Lvl 5, it lags well behind Qwen2-72B!

Also, the leaderboard's numbers are way off compared to Meta's official figures!

Based on the responses, I still believe Llama 3.1 will perform better than Qwen2 on the LMSYS Chatbot Arena, but it lags behind on too many benchmarks here!

Open LLM Leaderboard: open-llm-leaderboard/open_llm_leaderboard

Hopefully, this is just an Open LLM Leaderboard error! @open-llm-leaderboard SOS!
