Sudaksh Soti

sudaksh

sudaksh's activity

reacted to merve's post with ❤️ 5 months ago

This week in open-source AI was insane 🤠 A small recap 🕺🏻 https://huggingface.co/collections/merve/dec-6-releases-67545caebe9fc4776faac0a3

Multimodal 🖼️
> Google shipped PaliGemma 2, a new iteration of PaliGemma in more sizes (3B, 10B and 28B), with pre-trained and captioning variants 👏
> OpenGVLab released InternVL2.5, seven new vision LMs in different sizes, with a SOTA checkpoint, under the MIT license ✨
> The Qwen team at Alibaba released the base models of Qwen2-VL, with 2B, 7B and 72B checkpoints

LLMs 💬
> Meta released a new iteration of Llama 70B: Llama 3.3 70B, trained further
> EuroLLM-9B-Instruct is a new multilingual LLM for European languages under the Apache 2.0 license 🔥
> Dataset: CohereForAI released GlobalMMLU, a multilingual version of MMLU covering 42 languages, under the Apache 2.0 license
> Dataset: QwQ-LongCoT-130K is a new dataset for training reasoning models
> Dataset: FineWeb2 just landed with a multilinguality update! 🔥 Nearly 8TB of pretraining data in many languages!

Image/Video Generation 🖼️
> Tencent released HunyuanVideo, a new photorealistic video generation model
> OminiControl is a new editing/control framework for image generation models like Flux

Audio 🔊
> Indic-Parler-TTS is a new text-to-speech model made by the community

liked 2 models 5 months ago

microsoft/Phi-3.5-mini-instruct

Text Generation • Updated Mar 2 • 321k • 852

microsoft/trocr-base-handwritten

Image-to-Text • Updated Feb 11 • 192k • 403