s3nh

AI & ML interests

Quantization, LLMs, Deep Learning for good. Follow me if you like my work. Patreon.com/s3nh

s3nh's activity

reacted to ArthurZ's post with 🔥 2 days ago
reacted to reach-vb's post with 🔥 3 days ago
What a brilliant week for Open Source AI!

Qwen 2.5 Coder by Alibaba - 0.5B / 1.5B / 3B / 7B / 14B / 32B (Base + Instruct) code generation LLMs, with the 32B tackling giants like Gemini 1.5 Pro and Claude Sonnet
Qwen/qwen25-coder-66eaa22e6f99801bf65b0c2f

LLM2CLIP from Microsoft - Leverage LLMs to train ultra-powerful CLIP models! Boosts performance over the previous SOTA by ~17%
microsoft/llm2clip-672323a266173cfa40b32d4c

Athene v2 Chat & Agent by NexusFlow - SoTA general LLM fine-tuned from Qwen 2.5 72B that excels at Chat + Function Calling / JSON / Agents
Nexusflow/athene-v2-6735b85e505981a794fb02cc

Orca Agent Instruct by Microsoft - 1 million instruction pairs covering text editing, creative writing, coding, reading comprehension, etc. - permissively licensed
microsoft/orca-agentinstruct-1M-v1

Ultravox by FixieAI - 70B / 8B models approaching GPT-4o level; pick any LLM and train an adapter with Whisper as the audio encoder
reach-vb/ultravox-audio-language-model-release-67373b602af0a52b2a88ae71

JanusFlow 1.3B by DeepSeek - next iteration of their unified multimodal LLM Janus, with Rectified Flow
deepseek-ai/JanusFlow-1.3B

Common Corpus by PleIAs - 2,003,039,184,047 multilingual, commercially permissive and high-quality tokens!
PleIAs/common_corpus

I'm sure I missed a lot, can't wait for the next week!

Put down in comments what I missed! 🤗
reacted to Walmart-the-bag's post with 🔥 3 days ago
reacted to prithivMLmods's post with 👍🔥❤️ 3 days ago
reacted to BlinkDL's post with 🔥 3 days ago
RWKV-6-world-v3 (+3.1T tokens) is our best multilingual 7B model as of now: BlinkDL/rwkv-6-world

It's 100% RNN and attention-free. MMLU 54.2% (previous world-v2.1 = 47.9%. note: without eval-boosting tricks such as annealing).

RWKV-7-world-v4 soon :)
reacted to m-ric's post with ❤️🔥 3 days ago
𝗧𝗵𝗲 𝗻𝗲𝘅𝘁 𝗯𝗶𝗴 𝘀𝗼𝗰𝗶𝗮𝗹 𝗻𝗲𝘁𝘄𝗼𝗿𝗸 𝗶𝘀 𝗻𝗼𝘁 🦋, 𝗶𝘁'𝘀 𝗛𝘂𝗯 𝗣𝗼𝘀𝘁𝘀! [INSERT STONKS MEME WITH LASER EYES]

See below: I got 105k impressions since regularly posting Hub Posts, coming close to my 275k on Twitter!

⚙️ Computed with the great dataset maxiw/hf-posts
⚙️ Thanks to Qwen2.5-Coder-32B for showing me how to access dict attributes in a SQL request!

cc @merve who's far in front of me
replied to sayakpaul's post 3 days ago
reacted to sayakpaul's post with ❤️ 3 days ago
It's been a while since we shipped native quantization support in diffusers 🧨

We currently support bitsandbytes as the official backend, but using others like torchao is already very simple.

This post is just a reminder of what's possible:

1. Loading a model with a quantization config
2. Saving a model with a quantization config
3. Loading a pre-quantized model
4. enable_model_cpu_offload()
5. Training and loading LoRAs into quantized checkpoints

Docs:
https://huggingface.co/docs/diffusers/main/en/quantization/bitsandbytes
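
A minimal sketch of points 1-4 with the bitsandbytes backend; the Flux model ID, the 4-bit NF4 settings, and the local save path below are illustrative choices, not taken from the post itself:

import torch
from diffusers import BitsAndBytesConfig, FluxPipeline, FluxTransformer2DModel

# Illustrative 4-bit NF4 config; other bitsandbytes settings work the same way
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# 1. Load a model with a quantization config
transformer = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="transformer",
    quantization_config=quant_config,
    torch_dtype=torch.bfloat16,
)

# 2. Save it; the quantization config is stored alongside the checkpoint
transformer.save_pretrained("flux-transformer-nf4")

# 3. Load the pre-quantized checkpoint later without re-passing the config
transformer = FluxTransformer2DModel.from_pretrained("flux-transformer-nf4")

# 4. Drop it into a pipeline and offload to CPU to fit smaller GPUs
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()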
reacted to hexgrad's post with 🔥 3 days ago
reacted to abhishek's post with 🔥 12 days ago
INTRODUCING Hugging Face AutoTrain Client 🔥
Fine-tuning models got even easier!!!!
Now you can fine-tune SOTA models on all compatible dataset-model pairs on the Hugging Face Hub using Python, running on Hugging Face servers. Choose from a number of GPU flavors, millions of model and dataset pairs, and 10+ tasks 🤗

To try it, install autotrain-advanced using pip. You can ignore dependencies and install with --no-deps, but then you'd need to install some dependencies by hand.

"pip install autotrain-advanced"

Github repo: https://github.com/huggingface/autotrain-advanced
reacted to prithivMLmods's post with ❤️ 13 days ago
Quintet Drop : : 🤗

{ Flux LoRA DLC ⛵ } : prithivMLmods/FLUX-LoRA-DLC

-- Purple Dreamy
{ pop of color } : prithivMLmods/Purple-Dreamy-Flux-LoRA

-- Golden Dust
{ shimmer contrast } : prithivMLmods/Golden-Dust-Flux-LoRA

-- Lime Green
{ depth to the composition } : prithivMLmods/Lime-Green-Flux-LoRA

-- Flare Strike
{ Fractured Line } : prithivMLmods/Fractured-Line-Flare

-- Orange Chroma
{ studio lighting } : prithivMLmods/Orange-Chroma-Flux-LoRA
.
.
.
{ collection } : prithivMLmods/flux-lora-collections-66dd5908be2206cfaa8519be

@prithivMLmods
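
For anyone who wants to try one of the LoRAs listed above, a rough sketch with diffusers follows; the base Flux checkpoint and the prompt are assumptions, and each LoRA has its own trigger word, so check the individual model cards:

import torch
from diffusers import FluxPipeline

# Load a base Flux pipeline, then attach one of the LoRAs from the collection
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights("prithivMLmods/Purple-Dreamy-Flux-LoRA")

# Prompt is illustrative; use the trigger word from the LoRA's model card
image = pipe("a portrait, purple dreamy style", num_inference_steps=28).images[0]
image.save("purple_dreamy.png")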
reacted to prithivMLmods's post with ❤️🔥👍 14 days ago
New Droppings🥳

😶‍🌫️Collection: prithivMLmods/flux-lora-collections-66dd5908be2206cfaa8519be

🥳Demo Here: prithivMLmods/FLUX-LoRA-DLC with 100+ Flux LoRAs

🪨Fluid Dramatic Neon: prithivMLmods/Castor-Dramatic-Neon-Flux-LoRA
🪨Past & Present Blend: prithivMLmods/Past-Present-Deep-Mix-Flux-LoRA
🪨Tarot Cards Refreshed Themes: prithivMLmods/Ton618-Tarot-Cards-Flux-LoRA
🪨Amxtoon Character Mix Real-Anime: prithivMLmods/Ton618-Amxtoon-Flux-LoRA
🪨Epic Realism Flux v1: prithivMLmods/Ton618-Epic-Realism-Flux-LoRA
🪨Mock-up Textures: prithivMLmods/Mockup-Texture-Flux-LoRA
.
.
.
@prithivMLmods 🤗
reacted to chansung's post with 👍 14 days ago
Effortlessly stay up to date with AI research trends using a new AI tool, "AI Paper Reviewer"!!

It analyzes a list of Hugging Face Daily Papers (w/ @akhaliq) and turns them into insightful blog posts. This project leverages Gemini models (1.5 Pro, 1.5 Flash, and 1.5 Flash-8B) for content generation and Upstage Document Parse for parsing the layout and contents.
blog link: https://deep-diver.github.io/ai-paper-reviewer/

Also, here is the link to the GitHub repository for the parsing and generation pipeline. By using this, you can easily build your own GitHub static pages based on any arXiv papers of interest to you!
: https://github.com/deep-diver/paper-reviewer
reacted to reach-vb's post with 👀 15 days ago
Smol TTS models are here! OuteTTS-0.1-350M - zero-shot voice cloning, built on the LLaMa architecture, CC-BY license! 🔥

> Pure language modeling approach to TTS
> Zero-shot voice cloning
> LLaMa architecture w/ Audio tokens (WavTokenizer)
> BONUS: Works on-device w/ llama.cpp ⚡

Three-step approach to TTS:

> Audio tokenization using WavTokenizer (75 tok per second)
> CTC forced alignment for word-to-audio token mapping
> Structured prompt creation w/ transcription, duration, audio tokens

The model is extremely impressive for 350M parameters! Kudos to the OuteAI team on such a brilliant feat - I'd love to see this applied to larger data and smarter backbones like SmolLM 🤗

Check out the models here: OuteAI/outetts-6728aa71a53a076e4ba4817c
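
To make the three-step recipe above concrete, here is a purely illustrative sketch of how a structured prompt could interleave words, durations, and audio-token spans; the tag names and layout are invented for illustration and are not OuteTTS's actual prompt format:

def build_tts_prompt(word_alignments, audio_tokens):
    """word_alignments: list of (word, start, end) index spans produced by
    CTC forced alignment against the WavTokenizer audio-token stream."""
    parts = []
    for word, start, end in word_alignments:
        span = audio_tokens[start:end]
        duration = len(span) / 75  # WavTokenizer emits ~75 audio tokens per second
        token_str = "".join(f"<a{t}>" for t in span)  # hypothetical token markup
        parts.append(f"{word}<dur:{duration:.2f}>{token_str}")
    return "<tts>" + "".join(parts) + "</tts>"

# Toy example: two words aligned against a 75-token (one-second) clip
print(build_tts_prompt([("hello", 0, 30), ("world", 30, 75)], list(range(75))))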
reacted to merve's post with 🔥 3 months ago
I have put together a notebook on Multimodal RAG, where we do not process the documents with hefty pipelines but natively use:
- vidore/colpali for retrieval 📖 it doesn't need indexing with image-text pairs but just images!
- Qwen/Qwen2-VL-2B-Instruct for generation 💬 directly feed images as-is to a vision language model, with no conversion to text!
I used the ColPali implementation from the new 🐭 Byaldi library by @bclavie 🤗
https://github.com/answerdotai/byaldi
Link to notebook: https://github.com/merveenoyan/smol-vision/blob/main/ColPali_%2B_Qwen2_VL.ipynb
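
A rough end-to-end sketch of the same recipe, separate from the notebook; the docs folder, page-image file naming, and result-field names are assumptions for illustration, so details may differ from the notebook and across byaldi versions:

from byaldi import RAGMultiModalModel
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration
from PIL import Image

# 1. Retrieval: ColPali indexes the page images directly - no OCR, no text chunking
retriever = RAGMultiModalModel.from_pretrained("vidore/colpali")
retriever.index(input_path="docs/", index_name="demo_index",
                store_collection_with_index=False, overwrite=True)

query = "What does the revenue chart show?"
best = retriever.search(query, k=1)[0]
# Result attributes may differ slightly between byaldi versions
page_image = Image.open(f"docs/page_{best.page_num}.png")  # assumed file naming

# 2. Generation: hand the retrieved page image as-is to Qwen2-VL
model = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-2B-Instruct", torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-2B-Instruct")

messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": query},
]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[page_image], return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
answer = processor.batch_decode(
    output[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]
print(answer)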