It's beating Claude 3.7 on competitive programming (a domain where Anthropic has historically been very strong) and it's closing in on o1-mini/R1 on olympiad-level coding with just 7B parameters!
And the best part is that we're open-sourcing everything: the training dataset, the new IOI benchmark, and more in our Open-R1 progress report #3: https://huggingface.co/blog/open-r1/update-3
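If you want to kick the tires, here's a minimal sketch of querying the model with transformers. The repo id "open-r1/OlympicCoder-7B" is an assumption on my part; check the progress report for the checkpoints that were actually released.

```python
# Minimal sketch, assuming the checkpoint is published as
# "open-r1/OlympicCoder-7B" -- see the progress report for the real repo id.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="open-r1/OlympicCoder-7B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "user",
     "content": "Write a C++ program that reads n integers and prints their sum."},
]
out = pipe(messages, max_new_tokens=512)
# The chat pipeline returns the full conversation; the last turn is the reply.
print(out[0]["generated_text"][-1]["content"])
```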
After some heated discussion 🔥, we've clarified our intent re. storage limits on the Hub.
TL;DR:
- Public storage is free and, barring blatant abuse, unlimited. We do ask that you consider upgrading to PRO and/or Enterprise Hub if possible.
- Private storage is paid above a significant free tier (1 TB if you have a paid account, 100 GB otherwise).
MobileNetV4 weights are now in timm! So far these are the only weights for these models, as the official TensorFlow impl remains weightless.
Guided by the paper's hparams with a few tweaks, I've managed to match or beat the paper results when training the medium models. I'm still working on the large models and on improving the small result. They appear to be solid models for on-device use.
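For reference, a minimal sketch of pulling one of the new weights with timm. The exact model name below is an assumption; `timm.list_models('mobilenetv4*', pretrained=True)` shows which variants are actually published.

```python
# Minimal sketch: load a pretrained MobileNetV4 from timm and run a dummy
# forward pass. The model name is an assumption -- list published variants
# with timm.list_models('mobilenetv4*', pretrained=True).
import timm
import torch

model = timm.create_model("mobilenetv4_conv_medium", pretrained=True)
model.eval()

# Resolve the input size the weights were trained with.
cfg = timm.data.resolve_data_config({}, model=model)
x = torch.randn(1, *cfg["input_size"])
with torch.no_grad():
    logits = model(x)
print(logits.shape)  # e.g. torch.Size([1, 1000]) for an ImageNet-1k head
```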