Sayak Paul

sayakpaul

https://sayak.dev

AI & ML interests

Diffusion models, representation learning

Recent Activity

published a model 2 days ago

sayakpaul/subject200k-lr_1e-4-wd_1e-4-gs_30.0-cd_0.0-scheduler_constant-sim_flow-no8bitadam

new activity 2 days ago

Yuanshi/Subjects200K_collection3:Padding value

sayakpaul's activity

upvoted a collection 2 days ago

SANA-1.5

SANA-1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer • 6 items • Updated 7 days ago • 2

upvoted an article 7 days ago

Article

Don't repeat yourself - 🤗 Transformers Design Philosophy

Apr 5, 2022

• 24

upvoted an article 14 days ago

Article

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

15 days ago

• 345

upvoted an article 26 days ago

Article

Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models

Jun 24, 2024

• 190

upvoted a collection 28 days ago

Remote VAE Inference Endpoints

Models and handler code used in https://huggingface.co/blog/remote_vae • 5 items • Updated 16 days ago • 4

upvoted 2 articles about 1 month ago

Article

Remote VAEs for decoding with HF endpoints 🤗

about 1 month ago

• 37

Article

SigLIP 2: A better multilingual vision language encoder

Feb 21

• 145

upvoted a paper about 1 month ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published Feb 20 • 138

upvoted a collection about 1 month ago

PaliGemma 2 Release

Vision-Language Models available in multiple 3B, 10B and 28B variants. • 32 items • Updated 14 days ago • 145

upvoted 2 articles about 1 month ago

Article

Build awesome datasets for video generation

Feb 12

• 29

Article

Open-source DeepResearch – Freeing our search agents

Feb 4

• 1.19k

upvoted an article about 2 months ago

Article

The AI tools for Art Newsletter - Issue 1

Jan 31

• 70

upvoted an article 2 months ago

Article

Timm ❤️ Transformers: Use any timm model with transformers

Jan 16

• 44

upvoted a paper 3 months ago

LTX-Video: Realtime Video Latent Diffusion

Paper • 2501.00103 • Published Dec 30, 2024 • 46

upvoted an article 5 months ago

Article

🧨 Diffusers welcomes Stable Diffusion 3.5 Large

Oct 22, 2024

• 50

upvoted a paper 6 months ago

CogView3: Finer and Faster Text-to-Image Generation via Relay Diffusion

Paper • 2403.05121 • Published Mar 8, 2024 • 24

upvoted a paper 7 months ago

Enhancing Training Efficiency Using Packing with Flash Attention

Paper • 2407.09105 • Published Jul 12, 2024 • 15