Shetu Mohanto

shetumohanto

AI & ML interests

GenAI | MLOps | AI agents | Computer Vision


Organizations

Hugging Face Discord Community · open/acc · Smol Community

shetumohanto's activity

reacted to prithivMLmods's post with 🔥 1 day ago
Dropping some custom fine-tunes based on SigLIP2, with a single-label classification problem type! 🌀🧀

- AI vs Deepfake vs Real : prithivMLmods/AI-vs-Deepfake-vs-Real-Siglip2
- Deepfake Detect : prithivMLmods/Deepfake-Detect-Siglip2
- Fire Detection : prithivMLmods/Fire-Detection-Siglip2
- Deepfake Quality Assess : prithivMLmods/Deepfake-Quality-Assess-Siglip2
- Guard Against Unsafe Content : prithivMLmods/Guard-Against-Unsafe-Content-Siglip2

🌠 Collection: prithivMLmods/siglip2-custom-67bcdb2de8fe96b99fb4e19e
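These are standard single-label classifiers, so (assuming the checkpoints ship the usual transformers image-classification config) inference is a few lines with the pipeline API; the input file below is a hypothetical placeholder:

```python
# A minimal sketch, assuming these checkpoints expose the standard
# transformers image-classification head (model id taken from the
# list above; the input file name is hypothetical).
from PIL import Image
from transformers import pipeline

classifier = pipeline(
    "image-classification",
    model="prithivMLmods/Deepfake-Detect-Siglip2",
)

image = Image.open("suspect_frame.jpg")  # hypothetical local image
for pred in classifier(image):
    print(f"{pred['label']}: {pred['score']:.3f}")
```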
reacted to clem's post with 👍🔥 11 days ago
What are the best organizations to follow on @huggingface?

Off the top of my head:
- Deepseek (35,000 followers): https://huggingface.co/deepseek-ai
- Meta Llama (27,000 followers): https://huggingface.co/meta-llama
- Black Forest Labs (11,000 followers): https://huggingface.co/black-forest-labs
- OpenAI (5,000 followers): https://huggingface.co/openai
- Nvidia (16,000 followers): https://huggingface.co/nvidia
- Microsoft (9,000 followers): https://huggingface.co/microsoft
- AllenAI (2,000 followers): https://huggingface.co/allenai
- Mistral (5,000 followers): https://huggingface.co/mistralai
- xAI (600 followers): https://huggingface.co/xai-org
- Stability AI (16,000 followers): https://huggingface.co/stabilityai
- Qwen (16,000 followers): https://huggingface.co/Qwen
- GoogleAI (8,000 followers): https://huggingface.co/google
- Unsloth (3,000 followers): https://huggingface.co/unsloth
- Bria AI (4,000 followers): https://huggingface.co/briaai
- NousResearch (1,300 followers): https://huggingface.co/NousResearch

Bonus, the agent course org with 17,000 followers: https://huggingface.co/agents-course
  • 1 reply
Β·
reacted to mmhamdy's post with ❤️ 11 days ago
🎉 We're excited to introduce MemoryCode, a novel synthetic dataset designed to rigorously evaluate LLMs' ability to track and execute coding instructions across multiple sessions. MemoryCode simulates realistic workplace scenarios where a mentee (the LLM) receives coding instructions from a mentor amidst a stream of both relevant and irrelevant information.

💡 But what makes MemoryCode unique?! The combination of the following:

✅ Multi-Session Dialogue Histories: MemoryCode consists of chronological sequences of dialogues between a mentor and a mentee, mirroring real-world interactions between coworkers.

✅ Interspersed Irrelevant Information: Critical instructions are deliberately interspersed with unrelated content, replicating the information overload common in office environments.

✅ Instruction Updates: Coding rules and conventions can be updated multiple times throughout the dialogue history, requiring LLMs to track and apply the most recent information.

✅ Prospective Memory: Unlike previous datasets that cue information retrieval, MemoryCode requires LLMs to spontaneously recall and apply relevant instructions without explicit prompts.

✅ Practical Task Execution: LLMs are evaluated on their ability to use the retrieved information to perform practical coding tasks, bridging the gap between information recall and real-world application.

📌 Our Findings

1️⃣ While even small models can handle isolated coding instructions, the performance of top-tier models like GPT-4o dramatically deteriorates when instructions are spread across multiple sessions.

2️⃣ This performance drop isn't simply due to the length of the context. Our analysis indicates that LLMs struggle to reason compositionally over sequences of instructions and updates. They have difficulty keeping track of which instructions are current and how to apply them.

🔗 Paper: From Tools to Teammates: Evaluating LLMs in Multi-Session Coding Interactions (2502.13791)
📦 Code: https://github.com/for-ai/MemoryCode
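The prospective-memory setup is easy to picture in code. Below is a toy sketch of the idea; the session texts, rule format, and helper names are hypothetical, not the dataset's actual schema:

```python
# Toy sketch of the multi-session, latest-instruction-wins setup described
# above. Session texts, rule format, and helper names are hypothetical,
# not the actual MemoryCode schema.
sessions = [
    "Mentor: prefix all function names with 'x_'.",
    "Mentor: the cafeteria menu changed today.",                   # irrelevant filler
    "Mentor: update: prefix function names with 'fn_' instead.",   # supersedes session 1
]

def build_prompt(sessions: list[str], task: str) -> str:
    # Concatenate the full dialogue history, then pose the task without
    # any explicit cue to retrieve the naming rule (prospective memory).
    history = "\n\n".join(f"Session {i + 1}:\n{s}" for i, s in enumerate(sessions))
    return f"{history}\n\nTask: {task}"

prompt = build_prompt(sessions, "Write a function that adds two numbers.")
# A model that tracks updates should name the function fn_add, not x_add.
print(prompt)
```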
reacted to csabakecskemeti's post with 🤗 11 days ago
Testing training on AMD/ROCm for the first time!

I've got my hands on an AMD Instinct MI100. Used, it's about the same price as a V100, but on paper it has more TOPS (V100: 14 TOPS vs. MI100: 23 TOPS), and its HBM has a faster clock, so the memory bandwidth is 1.2 TB/s.
For quantized inference it's a beast (the MI50 was also surprisingly fast).

For LoRA training in this quick test I could not get the bitsandbytes (bnb) config to work, so I'm running the fine-tune on the full-size model.

I'll share everything I've learned about the install, setup, and settings in a blog post, together with the 3D design for the cooling shroud.
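For reference, a minimal full-precision LoRA setup with peft (no bitsandbytes quantization) looks like the sketch below; the base model id is an illustrative placeholder, not the model used in this test:

```python
# A minimal full-precision LoRA sketch with peft, i.e. no bitsandbytes
# quantization, matching the fallback described above. The base model id
# is an illustrative placeholder.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B",        # placeholder base model
    torch_dtype=torch.bfloat16,
)

lora_config = LoraConfig(
    r=16,                             # adapter rank
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()    # only the adapter weights are trainable
```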
reacted to davidberenstein1957's post with 🔥❤️ 3 months ago
Introducing the Synthetic Data Generator, a user-friendly application that takes a no-code approach to creating custom datasets with Large Language Models (LLMs). The best part: a simple step-by-step process makes dataset creation a non-technical breeze, letting anyone create datasets and models in minutes, without writing any code.

Blog: https://huggingface.co/blog/synthetic-data-generator
Space: argilla/synthetic-data-generator
  • 4 replies
Β·
reacted to lewtun's post with 🔥❤️ 3 months ago
We outperform Llama 70B with Llama 3B on hard math by scaling test-time compute 🔥

How? By combining step-wise reward models with tree search algorithms :)

We show that smol models can match or exceed the performance of their much larger siblings when given enough "time to think"

We're open sourcing the full recipe and sharing a detailed blog post.

In our blog post we cover:

📈 Compute-optimal scaling: How we implemented DeepMind's recipe to boost the mathematical capabilities of open models at test-time.

🎄 Diverse Verifier Tree Search (DVTS): An unpublished extension we developed to the verifier-guided tree search technique. This simple yet effective method improves diversity and delivers better performance, particularly at large test-time compute budgets.

🧭 Search and Learn: A lightweight toolkit for implementing search strategies with LLMs, built for speed with vLLM.

Here are the links:

- Blog post: HuggingFaceH4/blogpost-scaling-test-time-compute

- Code: https://github.com/huggingface/search-and-learn

Enjoy!
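The core idea is compact: sample many candidate solutions and let a step-wise (process) reward model pick the best one. Below is a toy best-of-N loop, one of the simplest strategies in this family; generate_candidates and prm_score are random stand-ins, not the search-and-learn API:

```python
# Toy best-of-N with a step-wise (process) reward model. The two helpers
# are random stand-ins, not the search-and-learn API.
import random

def generate_candidates(problem: str, n: int) -> list[list[str]]:
    # Stand-in: each candidate solution is a list of reasoning steps.
    return [[f"step {i}" for i in range(random.randint(2, 5))] for _ in range(n)]

def prm_score(steps: list[str]) -> float:
    # Stand-in PRM: score each step, aggregate with min (a common choice).
    return min(random.random() for _ in steps)

def best_of_n(problem: str, n: int = 16) -> list[str]:
    # More test-time compute = larger n = more chances to find a good path.
    candidates = generate_candidates(problem, n)
    return max(candidates, key=prm_score)

print(best_of_n("Solve: 12 * 13"))
```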
  • 2 replies
Β·
reacted to aaditya's post with ❤️ 3 months ago
Last Week in Medical AI: Top Research Papers/Models 🔥
🏅 (December 7 – December 14, 2024)

Medical LLM & Other Models
- PediaBench: Chinese Pediatric LLM
  - Comprehensive pediatric dataset
  - Advanced benchmarking platform
  - Chinese healthcare innovation
- BiMediX: Bilingual Medical LLM
  - Multilingual medical expertise
  - Diverse medical knowledge integration
  - Cross-cultural healthcare insights
- MMedPO: Vision-Language Medical LLM
  - Clinical multimodal optimization
  - Advanced medical image understanding
  - Precision healthcare modeling

Frameworks and Methodologies
- TOP-Training: Medical Q&A Framework
- Hybrid RAG: Secure Medical Data Management
- Zero-Shot ATC Clinical Coding
- Chest X-Ray Diagnosis Architecture
- Medical Imaging AI Democratization

Benchmarks & Evaluations
- KorMedMCQA: Korean Healthcare Licensing Benchmark
- Large Language Model Medical Tasks
- Clinical T5 Model Performance Study
- Radiology Report Quality Assessment
- Genomic Analysis Benchmarking

Medical LLM Applications
- BRAD: Digital Biology Language Model
- TCM-FTP: Herbal Prescription Prediction
- LLaSA: Activity Analysis via Sensors
- Emergency Department Visit Predictions
- Neurodegenerative Disease AI Diagnosis
- Kidney Disease Explainable AI Model

Ethical AI & Privacy
- Privacy-Preserving LLM Mechanisms
- AI-Driven Digital Organism Modeling
- Biomedical Research Automation
- Multimodality in Medical Practice

Full thread in detail: https://x.com/OpenlifesciAI/status/1867999825721242101
reacted to di-zhang-fdu's post with 🚀 3 months ago
reacted to sayakpaul's post with 🔥 3 months ago
reacted to hexgrad's post with 🔥 3 months ago
self.brag(): Kokoro finally got 300 votes in Pendrokar/TTS-Spaces-Arena after @Pendrokar was kind enough to add it 3 weeks ago.
Discounting the small sample size of votes, I think it is safe to say that hexgrad/Kokoro-TTS is currently a top 3 model among the contenders in that Arena. This is notable because:
- At 82M params, Kokoro is one of the smaller models in the Arena
- MeloTTS has 52M params
- F5 TTS has 330M params
- XTTSv2 has 467M params
reacted to merve's post with 👍 3 months ago
The authors of ColPali trained a retrieval model based on SmolVLM 🤠 https://huggingface.co/vidore/colsmolvlm-alpha
TL;DR:

- ColSmolVLM performs better than ColPali and DSE-Qwen2 on all English tasks

- ColSmolVLM is more memory efficient than ColQwen2 💗
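For context, ColPali-style retrievers score documents with late interaction (MaxSim): each query-token embedding is matched against every document-patch embedding, the best match per query token is kept, and the maxima are summed. A minimal sketch with random placeholder tensors (shapes are illustrative):

```python
# Minimal late-interaction (MaxSim) scoring, the mechanism behind
# ColPali-style retrievers such as ColSmolVLM. Tensors are random
# placeholders; real embeddings come from the model.
import torch

query = torch.randn(12, 128)   # 12 query-token embeddings, 128-dim
doc = torch.randn(900, 128)    # 900 document-patch embeddings

# For each query token keep its best-matching patch, then sum the maxima.
sim = query @ doc.T                        # (12, 900) token-patch similarities
score = sim.max(dim=1).values.sum()
print(f"late-interaction score: {score.item():.2f}")
```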
reacted to andito's post with 🔥❤️ 3 months ago
Let's go! We are releasing SmolVLM, a smol 2B VLM built for on-device inference that outperforms all models at similar GPU RAM usage and token throughput.

- SmolVLM generates tokens 7.5 to 16 times faster than Qwen2-VL! 🤯
- Other models at this size crash a laptop, but SmolVLM comfortably generates 17 tokens/sec on a MacBook! 🚀
- SmolVLM can be fine-tuned in a Google Colab! Or process millions of documents with a consumer GPU!
- SmolVLM even outperforms larger models in video benchmarks, despite not even being trained on videos!

Check out more!
Demo: HuggingFaceTB/SmolVLM
Blog: https://huggingface.co/blog/smolvlm
Model: HuggingFaceTB/SmolVLM-Instruct
Fine-tuning script: https://github.com/huggingface/smollm/blob/main/finetuning/Smol_VLM_FT.ipynb
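As a starting point, inference should follow the standard transformers vision-to-text flow; a minimal sketch is below (see the blog and model card for the exact recipe, and note document.png is a hypothetical input):

```python
# A minimal inference sketch following the standard transformers
# vision-to-text flow; see the blog and model card for the exact recipe.
# document.png is a hypothetical input image.
import torch
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

model_id = "HuggingFaceTB/SmolVLM-Instruct"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.bfloat16)

image = Image.open("document.png")
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Describe this document."},
]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(output, skip_special_tokens=True)[0])
```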
reacted to openfree's post with ❤️ 3 months ago
Hackathon: 1-Minute Creative Innovation with AI
Total Prize: 20,000 USD (USDT)

"One-minute creation by AI Coding Autonomous Agent MOUSE-I"
Hosted by VIDraft | Organized by Korea AI Promotion Association (KAIPA)

🌟 Revolutionary Era of AI Coding
"Creating a web app in just one minute" - This is no longer just imagination, but reality. With the emergence of AI Coding Autonomous Agent MOUSE-I ("One-minute creation by AI Coding Autonomous Agent MOUSE-I"), we are witnessing a new era of software development.

πŸ† Period & Prizes
Period: November 28 - December 23, 2024

Total Prize: 20,000 USD(USDT)

πŸ†
Top Rank: 10,000 USDT
Highest HuggingFace Trending Rank

❀️
Top Likes: 5,000 USDT
Most Likes

πŸ’«
Top Creative: 5,000 USDT
Most Innovative

🚀 Participation Process
1. Start with MOUSE-I
• Access https://VIDraft-mouse1.hf.space
• Read the notice: VIDraft/Mouse-Hackathon
• Generate basic web app code in 1 minute
• Create unlimited content: games, dashboards, landing pages, utilities, etc.

2. Creative Development
• Develop freely on top of the MOUSE-I-generated code
• Additional languages like Python can be used

3. Submission
• Public deployment on Hugging Face
• Register in Static mode
• Required in README.md:

short_description: "One-minute creation by AI Coding Autonomous Agent MOUSE-I"

📅 Key Dates
• Submission Deadline: December 23, 2024, midnight (NYC time)
• Winners Announcement: December 24, 2024

✨ Participant Benefits
• Full ownership and copyright of all creations
• Experience a new paradigm of AI coding
• Multiple submissions allowed from the same account
• Contact: arxivgpt@gmail.com

"Give Yourself the Best Christmas Gift
reacted to as-cle-bert's post with 🔥 3 months ago
Hi HuggingFacers! 🤗
I'm thrilled to introduce my latest project: SenTrEv (Sentence Transformers Evaluator), a Python package that offers simple, customizable evaluation of text-retrieval accuracy and time performance for Sentence Transformers-compatible text embedders on PDF data! 📊

Learn more in my LinkedIn post: https://www.linkedin.com/posts/astra-clelia-bertelli-583904297_python-embedders-semanticsearch-activity-7266754133557190656-j1e3

And on the GitHub repo: https://github.com/AstraBert/SenTrEv
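For a sense of what such an evaluation measures, here is a generic sketch of a retrieval-accuracy and timing check using sentence-transformers directly; this is NOT SenTrEv's actual API, and the chunks and query are hypothetical:

```python
# Generic sketch of the kind of retrieval-accuracy and timing check that
# SenTrEv automates. Uses sentence-transformers directly; NOT SenTrEv's API.
import time
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
chunks = ["First chunk of PDF text ...", "Second chunk of PDF text ..."]
queries = ["query whose answer lives in the second chunk"]
relevant = [1]  # index of the relevant chunk for each query

start = time.perf_counter()
chunk_emb = model.encode(chunks, convert_to_tensor=True)
query_emb = model.encode(queries, convert_to_tensor=True)
hits = util.semantic_search(query_emb, chunk_emb, top_k=1)
elapsed = time.perf_counter() - start

accuracy = sum(h[0]["corpus_id"] == r for h, r in zip(hits, relevant)) / len(queries)
print(f"top-1 accuracy: {accuracy:.2f}, wall time: {elapsed:.3f}s")
```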

Have fun! 🍕