InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation Paper • 2404.19427 • Published 2 days ago • 45
ZeroGPU Spaces Collection ZeroGPU Spaces made by the community • 15 items • Updated about 16 hours ago • 69
GreenBitAI MLX LLM Collection GreenBitAI's Low-bit LLMs in MLX format • 69 items • Updated 1 day ago • 2
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models Paper • 2404.18796 • Published 3 days ago • 47
AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation Paper • 2404.12753 • Published 13 days ago • 36
Article Expanding Model Context and Creating Chat Models with a Single Click By maywell • 4 days ago • 22
Article ⚗️ 🧑🏼🌾 Let's grow some Domain Specific Datasets together By burtenshaw • 3 days ago • 22
PLLaVA : Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning Paper • 2404.16994 • Published 7 days ago • 28
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites Paper • 2404.16821 • Published 7 days ago • 47
PuLID: Pure and Lightning ID Customization via Contrastive Alignment Paper • 2404.16022 • Published 8 days ago • 15
CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data Paper • 2404.15653 • Published 8 days ago • 20
LLaVA++ (LLaMA-3 and Phi-3-Mini) Collection Extending Visual Capabilities of LLaVA with LLaMA-3 and Phi-3 • 11 items • Updated 2 days ago • 20
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated 14 days ago • 449
How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study Paper • 2404.14047 • Published 10 days ago • 37
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published 10 days ago • 106
Article LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!) By wolfram • 8 days ago • 31
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published 10 days ago • 224
Article seemore: Implement a Vision Language Model from Scratch By AviSoori1x • about 21 hours ago • 36
TextSquare: Scaling up Text-Centric Visual Instruction Tuning Paper • 2404.12803 • Published 13 days ago • 27
VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time Paper • 2404.10667 • Published 16 days ago • 9
Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community • 18 days ago • 93
Idefics2 🐶 Collection Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 8 items • Updated 15 days ago • 58
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention Paper • 2404.07143 • Published 22 days ago • 90
ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback Paper • 2404.07987 • Published 21 days ago • 45
Transformers.js demos Collection A collection of my favorite WebML demos, built with Transformers.js! • 22 items • Updated 8 days ago • 29
Article Building a Neural Network Classifier from the Ground Up: A Step-by-Step Guide By dcarpintero • 21 days ago • 4
Article It's raining diffusion personalization techniques☔️🎭🖼️ By linoyts • 21 days ago • 15
Article DS-MoE: Making MoE Models More Efficient and Less Memory-Intensive By bpan • 23 days ago • 25
Article A guide to setting up your own Hugging Face leaderboard: an end-to-end example with Vectara's hallucination leaderboard • Jan 12 • 3
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models Paper • 2404.02258 • Published 30 days ago • 98
Llama2-7B HQQ+ Collection Extreme low-bit quantization with HQQ+ (HQQ + LoRA adapter) • 3 items • Updated 13 days ago • 14
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models Paper • 2403.18814 • Published Mar 27 • 37
Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation Paper • 2403.12015 • Published Mar 18 • 60
Mora: Enabling Generalist Video Generation via A Multi-Agent Framework Paper • 2403.13248 • Published Mar 20 • 71